CHAPTER 1
Logic 


The main subject of Mathematical Logic is mathematical proof. In this 
introductory chapter we deal with the basics of formalizing such proofs. The 
system we pick for the representation of proofs is Gentzen’s natural deduc- 
tion, from [8]. Our reasons for this choice are twofold. First, as the name 
says this is a natural notion of formal proof, which means that the way proofs 
are represented corresponds very much to the way a careful mathematician 
writing out all details of an argument would go anyway. Second, formal 
proofs in natural deduction are closely related (via the so-called Curry- 
Howard correspondence) to terms in typed lambda calculus. This provides 
us not only with a compact notation for logical derivations (which otherwise
tend to become somewhat unmanageable tree-like structures), but also
opens up a route to applying the computational techniques which underpin 
lambda calculus. 

Apart from classical logic we will also deal with more constructive logics: 
minimal and intuitionistic logic. This will reveal some interesting aspects of 
proofs, e.g. that it is possible and useful to distinguish between existential
proofs that actually construct witnessing objects, and others that don’t. As 
an example, consider the following proposition. 


There are irrational numbers $a, b$ such that $a^b$ is rational.


This can be proved as follows, by cases. 


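Case $\sqrt{2}^{\sqrt{2}}$ is rational. Choose $a = \sqrt{2}$ and $b = \sqrt{2}$. Then $a, b$ are
irrational and by assumption $a^b$ is rational.


Case $\sqrt{2}^{\sqrt{2}}$ is irrational. Choose $a = \sqrt{2}^{\sqrt{2}}$ and $b = \sqrt{2}$. Then by
assumption $a, b$ are irrational and


$$a^b = \left(\sqrt{2}^{\sqrt{2}}\right)^{\sqrt{2}} = \left(\sqrt{2}\right)^2 = 2$$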


is rational. 


As long as we have not decided whether $\sqrt{2}^{\sqrt{2}}$ is rational, we do not
know which numbers a,b we must take. Hence we have an example of an 
existence proof which does not provide an instance. 
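
The arithmetic in the second case can be checked numerically. The following is only a floating-point sanity check under the assumption that double precision is accurate enough here; it says nothing about which of the two cases actually holds. The names `sqrt2`, `a`, `value` and `close_to_two` are chosen for this sketch.

```python
import math

sqrt2 = math.sqrt(2)

# Candidate from the second case: a = sqrt(2) ** sqrt(2), b = sqrt(2).
a = sqrt2 ** sqrt2

# a ** b = (sqrt(2) ** sqrt(2)) ** sqrt(2), which equals sqrt(2) ** 2 = 2
# in exact arithmetic; in floating point we only expect a value near 2.
value = a ** sqrt2


def close_to_two(x, tol=1e-12):
    """Return True if x agrees with 2 up to the tolerance tol."""
    return abs(x - 2.0) < tol
```

Calling `close_to_two(value)` confirms the displayed chain of equalities up to rounding error.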

An essential point for Mathematical Logic is to fix a formal language to 
be used. We take implication $\to$ and the universal quantifier $\forall$ as basic. Then
the logic rules correspond to lambda calculus. The additional connectives $\bot$,
$\exists$, $\lor$ and $\land$ are defined via axiom schemes. These axiom schemes will later
be seen as special cases of introduction and elimination rules for inductive 
definitions. 
1. Formal Languages


1.1. Terms and Formulas. Let a countably infinite set $\{ v_i \mid i \in \mathbb{N} \}$
of variables be given; they will be denoted by $x, y, z$. A first order language
$\mathcal{L}$ then is determined by its signature, which is to mean the following.


• For every natural number $n \ge 0$ a (possibly empty) set of $n$-ary relation
symbols (also called predicate symbols). 0-ary relation symbols
are called propositional symbols. $\bot$ (read “falsum”) is required as
a fixed propositional symbol. The language will not, unless stated
otherwise, contain $=$ as a primitive.

• For every natural number $n \ge 0$ a (possibly empty) set of $n$-ary
function symbols. 0-ary function symbols are called constants.


We assume that all these sets of variables, relation and function symbols are 
disjoint. 

For instance the language $\mathcal{L}_G$ of group theory is determined by the signature
consisting of the following relation and function symbols: the group
operation $\circ$ (a binary function symbol), the unit $e$ (a constant), the inverse
operation ${}^{-1}$ (a unary function symbol) and finally equality $=$ (a binary
relation symbol).

$\mathcal{L}$-terms are inductively defined as follows.


• Every variable is an $\mathcal{L}$-term.

• Every constant of $\mathcal{L}$ is an $\mathcal{L}$-term.

• If $t_1, \dots, t_n$ are $\mathcal{L}$-terms and $f$ is an $n$-ary function symbol of $\mathcal{L}$
with $n \ge 1$, then $f(t_1, \dots, t_n)$ is an $\mathcal{L}$-term.

From $\mathcal{L}$-terms one constructs $\mathcal{L}$-prime formulas, also called atomic
formulas of $\mathcal{L}$: If $t_1, \dots, t_n$ are terms and $R$ is an $n$-ary relation symbol of $\mathcal{L}$,
then $R(t_1, \dots, t_n)$ is an $\mathcal{L}$-prime formula.

$\mathcal{L}$-formulas are inductively defined from $\mathcal{L}$-prime formulas by

• Every $\mathcal{L}$-prime formula is an $\mathcal{L}$-formula.

• If $A$ and $B$ are $\mathcal{L}$-formulas, then so are $(A \to B)$ (“if $A$, then $B$”),
$(A \land B)$ (“$A$ and $B$”) and $(A \lor B)$ (“$A$ or $B$”).

• If $A$ is an $\mathcal{L}$-formula and $x$ is a variable, then $\forall x A$ (“for all $x$, $A$
holds”) and $\exists x A$ (“there is an $x$ such that $A$”) are $\mathcal{L}$-formulas.


Negation, classical disjunction, and the classical existential quantifier 
are defined by 


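$$\begin{aligned}
\neg A &:= A \to \bot, \\
A \lor^{cl} B &:= \neg A \to \neg B \to \bot, \\
\exists^{cl} x\, A &:= \neg \forall x \neg A.
\end{aligned}$$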


Usually we fix a language $\mathcal{L}$, and speak of terms and formulas instead
of $\mathcal{L}$-terms and $\mathcal{L}$-formulas. We use


$r, s, t$ for terms,
$x, y, z$ for variables,
$c$ for constants,
$P, Q, R$ for relation symbols,
$f, g, h$ for function symbols,
$A, B, C, D$ for formulas.
DEFINITION. The depth $\mathrm{dp}(A)$ of a formula $A$ is the maximum length
of a branch in its construction tree. In other words, we define recursively
$\mathrm{dp}(P) = 0$ for atomic $P$, $\mathrm{dp}(A \circ B) = \max(\mathrm{dp}(A), \mathrm{dp}(B)) + 1$ for binary
operators $\circ$, $\mathrm{dp}(\circ A) = \mathrm{dp}(A) + 1$ for unary operators $\circ$.

The size or length $|A|$ of a formula $A$ is the number of occurrences of
logical symbols and atomic formulas (parentheses not counted) in $A$: $|P| = 1$
for $P$ atomic, $|A \circ B| = |A| + |B| + 1$ for binary operators $\circ$, $|{\circ} A| = |A| + 1$
for unary operators $\circ$.


One can show easily that $|A| + 1 \le 2^{\mathrm{dp}(A)+1}$.
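
The two recursive definitions translate directly into code. The sketch below assumes a representation of formulas as nested tuples, which is our own choice and not part of the text: `("atom", name)`, `(op, A, B)` for the binary operators `"imp"`, `"and"`, `"or"`, and `(q, x, A)` for the quantifiers `"all"`, `"ex"`, treated as the unary operators.

```python
BINARY = {"imp", "and", "or"}   # binary operators: ->, and, or
UNARY = {"all", "ex"}           # quantifiers, treated as unary operators


def dp(a):
    """Depth: maximum length of a branch in the construction tree."""
    tag = a[0]
    if tag == "atom":
        return 0
    if tag in BINARY:
        return max(dp(a[1]), dp(a[2])) + 1
    if tag in UNARY:
        return dp(a[2]) + 1      # a = (quantifier, variable, body)
    raise ValueError(f"unknown formula tag: {tag}")


def size(a):
    """Size: number of occurrences of logical symbols and atomic formulas."""
    tag = a[0]
    if tag == "atom":
        return 1
    if tag in BINARY:
        return size(a[1]) + size(a[2]) + 1
    if tag in UNARY:
        return size(a[2]) + 1
    raise ValueError(f"unknown formula tag: {tag}")


# (P -> Q) -> R and (forall x R'(x)), used as a running example
example = ("imp",
           ("imp", ("atom", "P"), ("atom", "Q")),
           ("and", ("atom", "R"), ("all", "x", ("atom", "R'(x)"))))
```

For `example` the bound $|A| + 1 \le 2^{\mathrm{dp}(A)+1}$ can be checked by evaluating `size(example) + 1 <= 2 ** (dp(example) + 1)`; for an atom the bound holds with equality.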


NOTATION (Saving on parentheses). In writing formulas we save on
parentheses by assuming that $\forall, \exists, \neg$ bind more strongly than $\land, \lor$, and
that in turn $\land, \lor$ bind more strongly than $\to, \leftrightarrow$ (where $A \leftrightarrow B$ abbreviates
$(A \to B) \land (B \to A)$). Outermost parentheses are also usually dropped.
Thus $A \land \neg B \to C$ is read as $((A \land (\neg B)) \to C)$. In the case of iterated
implications we sometimes use the short notation


$$A_1 \to A_2 \to \dots A_{n-1} \to A_n \quad\text{for}\quad A_1 \to (A_2 \to \dots (A_{n-1} \to A_n) \dots).$$


To save parentheses in quantified formulas, we use a mild form of the dot
notation: a dot immediately after $\forall x$ or $\exists x$ makes the scope of that quantifier
as large as possible, given the parentheses around. So $\forall x. A \to B$ means
$\forall x (A \to B)$, not $(\forall x A) \to B$.

We also save on parentheses by writing e.g. $Rxyz$, $Rt_0t_1t_2$ instead of
$R(x, y, z)$, $R(t_0, t_1, t_2)$, where $R$ is some predicate symbol. Similarly for a
unary function symbol with a (typographically) simple argument, so fx for 
f(x), etc. In this case no confusion will arise. But readability requires that 
we write in full R( fx, gy, hz), instead of Rfxgyhz. 

Binary function and relation symbols are usually written in infix nota- 
tion, e.g. x + y instead of +(x, y), and x < y instead of <(x,y). We write 
$t \ne s$ for $\neg(t = s)$ and $t \not< s$ for $\neg(t < s)$.


1.2. Substitution, Free and Bound Variables. Expressions $\mathcal{E}, \mathcal{E}'$
which differ only in the names of bound variables will be regarded as identical.
This is sometimes expressed by saying that $\mathcal{E}$ and $\mathcal{E}'$ are $\alpha$-equivalent.
In other words, we are only interested in expressions “modulo renaming of 
bound variables”. There are methods of finding unique representatives for 
such expressions, for example the namefree terms of de Bruijn [7]. For the 
human reader such representations are less convenient, so we shall stick to 
the use of bound variables. 

In the definition of “substitution of expression $\mathcal{E}'$ for variable $x$ in
expression $\mathcal{E}$”, either one requires that no variable free in $\mathcal{E}'$ becomes bound
by a variable-binding operator in $\mathcal{E}$, when the free occurrences of $x$ are
replaced by $\mathcal{E}'$ (also expressed by saying that there must be no “clashes of
variables”, or that “$\mathcal{E}'$ is free for $x$ in $\mathcal{E}$”), or the substitution operation is taken to
involve a systematic renaming operation for the bound variables, avoiding 
clashes. Having stated that we are only interested in expressions modulo 
renaming bound variables, we can without loss of generality assume that 
substitution is always possible. 




Also, it is never a real restriction to assume that distinct quantifier 
occurrences are followed by distinct variables, and that the sets of bound 
and free variables of a formula are disjoint. 


NOTATION. “FV” is used for the (set of) free variables of an expression; 
so FV(t) is the set of variables free in the term t, FV(A) the set of variables 
free in formula A etc. 

$\mathcal{E}[x := t]$ denotes the result of substituting the term $t$ for the variable
$x$ in the expression $\mathcal{E}$. Similarly, $\mathcal{E}[\vec{x} := \vec{t}\,]$ is the result of simultaneously
substituting the terms $\vec{t} = t_1, \dots, t_n$ for the variables $\vec{x} = x_1, \dots, x_n$,
respectively.

Locally we shall adopt the following convention. In an argument, once 
a formula has been introduced as $A(x)$, i.e., $A$ with a designated variable $x$,
we write A(t) for A[x := t], and similarly with more variables. 
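
To make the role of renaming concrete, here is a minimal sketch of free variables and of substitution with systematic renaming of bound variables. The tuple representation (`("var", x)`, `("fn", f, args)` for terms; `("rel", R, args)`, binary connectives and quantifiers for formulas) and the helper names `fv_term`, `fv`, `subst`, `fresh` are assumptions of this sketch, not notation from the text.

```python
import itertools


def fv_term(t):
    """Free variables of a term ("var", x) or ("fn", f, [args])."""
    if t[0] == "var":
        return {t[1]}
    return set().union(*[fv_term(s) for s in t[2]])


def fv(a):
    """Free variables FV(A) of a formula."""
    tag = a[0]
    if tag == "rel":                       # ("rel", R, [terms])
        return set().union(*[fv_term(t) for t in a[2]])
    if tag in ("imp", "and", "or"):
        return fv(a[1]) | fv(a[2])
    return fv(a[2]) - {a[1]}               # ("all" | "ex", x, body)


def subst_term(t, x, r):
    """t[x := r] for terms."""
    if t[0] == "var":
        return r if t[1] == x else t
    return ("fn", t[1], [subst_term(s, x, r) for s in t[2]])


def fresh(avoid):
    """A variable name v0, v1, ... that does not occur in avoid."""
    return next(v for v in (f"v{i}" for i in itertools.count()) if v not in avoid)


def subst(a, x, r):
    """A[x := r], renaming bound variables to avoid clashes."""
    tag = a[0]
    if tag == "rel":
        return ("rel", a[1], [subst_term(t, x, r) for t in a[2]])
    if tag in ("imp", "and", "or"):
        return (tag, subst(a[1], x, r), subst(a[2], x, r))
    y, body = a[1], a[2]
    if y == x:                             # x is not free below this quantifier
        return a
    if y in fv_term(r) and x in fv(body):  # clash: rename the bound variable
        z = fresh(fv_term(r) | fv(body) | {x})
        body, y = subst(body, y, ("var", z)), z
    return (tag, y, subst(body, x, r))
```

For instance, substituting $y$ for $x$ in $\forall y\, R(x, y)$ first renames the bound $y$, so the result is $\forall v_0\, R(y, v_0)$ rather than the incorrect $\forall y\, R(y, y)$.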


1.3. Subformulas. Unless stated otherwise, the notion of subformula 
we use will be that of a subformula in the sense of Gentzen. 
DEFINITION. (Gentzen) subformulas of $A$ are defined by
(a) $A$ is a subformula of $A$;
(b) if $B \circ C$ is a subformula of $A$ then so are $B$, $C$, for $\circ = \to, \land, \lor$;
(c) if $\forall x B$ or $\exists x B$ is a subformula of $A$, then so is $B[x := t]$, for all $t$ free
for $x$ in $B$.


If we replace the third clause by: 
(c)' if $\forall x B$ or $\exists x B$ is a subformula of $A$ then so is $B$,


we obtain the notion of literal subformula. 


DEFINITION. The notions of positive, negative, strictly positive subformula
are defined in a similar style:

(a) $A$ is a positive and a strictly positive subformula of itself;

(b) if $B \land C$ or $B \lor C$ is a positive [negative, strictly positive] subformula of
$A$, then so are $B$, $C$;

(c) if $\forall x B$ or $\exists x B$ is a positive [negative, strictly positive] subformula of $A$,
then so is $B[x := t]$;

(d) if $B \to C$ is a positive [negative] subformula of $A$, then $B$ is a negative
[positive] subformula of $A$, and $C$ is a positive [negative] subformula of
$A$;

(e) if $B \to C$ is a strictly positive subformula of $A$, then so is $C$.


A strictly positive subformula of A is also called a strictly positive part 
(s.p.p.) of A. Note that the set of subformulas of A is the union of the 
positive and negative subformulas of A. Literal positive, negative, strictly 
positive subformulas may be defined in the obvious way by restricting the 
clause for quantifiers. 

EXAMPLE. $(P \to Q) \to R \land \forall x R'(x)$ has as s.p.p.'s the whole formula,
$R \land \forall x R'(x)$, $R$, $\forall x R'(x)$, $R'(t)$. The positive subformulas are the s.p.p.'s
and in addition $P$; the negative subformulas are $P \to Q$, $Q$.
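
The clauses (a) to (e) can be read as a recursive traversal that carries the current polarity along. The sketch below computes the literal variants only (quantifier bodies are kept as they are, without substituting terms); the tuple representation of formulas and the name `polarity_sets` are assumptions of this sketch.

```python
def polarity_sets(a):
    """Literal positive, negative and strictly positive subformulas of a.

    Formulas are nested tuples: ("atom", name), ("imp" | "and" | "or", B, C),
    ("all" | "ex", x, B).
    """
    pos, neg, spp = [], [], []

    def walk(b, positive, strict):
        (pos if positive else neg).append(b)
        if strict:
            spp.append(b)
        tag = b[0]
        if tag in ("and", "or"):           # clause (b): polarity is inherited
            walk(b[1], positive, strict)
            walk(b[2], positive, strict)
        elif tag in ("all", "ex"):         # clause (c), literal version
            walk(b[2], positive, strict)
        elif tag == "imp":                 # clauses (d) and (e)
            walk(b[1], not positive, False)
            walk(b[2], positive, strict)

    walk(a, True, True)                    # clause (a)
    return pos, neg, spp


P, Q, R = ("atom", "P"), ("atom", "Q"), ("atom", "R")
R1 = ("atom", "R'(x)")
# (P -> Q) -> R and (forall x R'(x)), the formula of the example above
A = ("imp", ("imp", P, Q), ("and", R, ("all", "x", R1)))
```

Running `polarity_sets(A)` reproduces the example: the only positive subformula that is not strictly positive is `P`, and the negative ones are `P -> Q` and `Q`.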


2. Natural Deduction 


We introduce Gentzen’s system of natural deduction. To allow a direct 
correspondence with the lambda calculus, we restrict the rules used to those 
for the logical connective $\to$ and the universal quantifier $\forall$. The rules come
in pairs: we have an introduction and an elimination rule for each of these.
The other logical connectives are introduced by means of axiom schemes:
this is done for conjunction $\land$, disjunction $\lor$ and the existential quantifier
$\exists$. The resulting system is called minimal logic; it was introduced by
Johansson in 1937 [14]. Notice that no negation is present.

If we then go on and require the ex-falso-quodlibet scheme for the nullary
propositional symbol $\bot$ (“falsum”), we can embed intuitionistic logic. To
obtain classical logic, we add as an axiom scheme the principle of indirect
proof, also called stability. However, to obtain classical logic it suffices to
restrict to the language based on $\to$, $\forall$, $\bot$ and $\land$; we can introduce classical
disjunction $\lor^{cl}$ and the classical existential quantifier $\exists^{cl}$ via their (classical)
definitions above. For these the usual introduction and elimination properties
can then be derived.


2.1. Examples of Derivations. Let us start with some examples for 
natural proofs. Assume that a first order language L is given. For simplicity 
we only consider here proofs in pure logic, i.e. without assumptions (axioms) 
on the functions and relations used. 


(1) $(A \to B \to C) \to (A \to B) \to A \to C$


Assume $A \to B \to C$. To show: $(A \to B) \to A \to C$. So assume $A \to B$.
To show: $A \to C$. So finally assume $A$. To show: $C$. We have $A$, by the last
assumption. Hence also $B \to C$, by the first assumption, and $B$, using the
next to last assumption. From $B \to C$ and $B$ we obtain $C$, as required.


(2) $(\forall x. A \to B) \to A \to \forall x B$, if $x \notin \mathrm{FV}(A)$.


Assume $\forall x. A \to B$. To show: $A \to \forall x B$. So assume $A$. To show: $\forall x B$.
Let $x$ be arbitrary; note that we have not made any assumptions on $x$. To
show: $B$. We have $A \to B$, by the first assumption. Hence also $B$, by the
second assumption.


(3) $(A \to \forall x B) \to \forall x. A \to B$, if $x \notin \mathrm{FV}(A)$.


Assume $A \to \forall x B$. To show: $\forall x. A \to B$. Let $x$ be arbitrary; note that we
have not made any assumptions on $x$. To show: $A \to B$. So assume $A$. To
show: $B$. We have $\forall x B$, by the first and second assumption. Hence also
$B$.


A characteristic feature of these proofs is that assumptions are intro- 
duced and eliminated again. At any point in time during the proof the free 
or “open” assumptions are known, but as the proof progresses, free assump- 
tions may become cancelled or “closed” because of the implies-introduction 
rule. 

We now reserve the word proof for the informal level; a formal represen- 
tation of a proof will be called a derivation. 

An intuitive way to communicate derivations is to view them as labelled 
trees. The labels of the inner nodes are the formulas derived at those points, 
and the labels of the leaves are formulas or terms. The labels of the nodes 
immediately above a node v are the premises of the rule application, the 
formula at node v is its conclusion. At the root of the tree we have the 
conclusion of the whole derivation. In natural deduction systems one works 
with assumptions affixed to some leaves of the tree; they can be open or else
closed. 

Any of these assumptions carries a marker. As markers we use assumption
variables $\Box_0, \Box_1, \dots$, denoted by $u, v, w, u_0, u_1, \dots$. The (previous)
variables will now often be called object variables, to distinguish them
from assumption variables. If at a later stage (i.e. at a node below an as- 
sumption) the dependency on this assumption is removed, we record this by 
writing down the assumption variable. Since the same assumption can be 
used many times (this was the case in example (1)), the assumption marked 
with $u$ (and communicated by $u\colon A$) may appear many times. However, we
insist that distinct assumption formulas must have distinct markers. 

An inner node of the tree is understood as the result of passing from
premises to a conclusion, as described by a given rule. The label of the node
then contains in addition to the conclusion also the name of the rule. In some
cases the rule binds or closes an assumption variable $u$ (and hence removes
the dependency of all assumptions $u\colon A$ marked with that $u$). An application
of the $\forall$-introduction rule similarly binds an object variable $x$ (and hence
removes the dependency on x). In both cases the bound assumption or 
object variable is added to the label of the node. 


2.2. Introduction and Elimination Rules for $\to$ and $\forall$. We now
formulate the rules of natural deduction. First we have an assumption rule,
that allows an arbitrary formula $A$ to be put down, together with a marker
$u$:

$u\colon A$  Assumption


The other rules of natural deduction split into introduction rules (I-rules
for short) and elimination rules (E-rules) for the logical connectives $\to$ and
$\forall$. For implication $\to$ there is an introduction rule ${\to^+}u$ and an elimination
rule $\to^-$, also called modus ponens. The left premise $A \to B$ in $\to^-$ is
called major premise (or main premise), and the right premise $A$ minor
premise (or side premise). Note that with an application of the ${\to^+}u$-rule
all assumptions above it marked with $u\colon A$ are cancelled.


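$$
\dfrac{\begin{matrix} [u\colon A] \\ \mid M \\ B \end{matrix}}{A \to B}\ {\to^+}u
\qquad\qquad
\dfrac{\begin{matrix} \mid M \\ A \to B \end{matrix} \quad \begin{matrix} \mid N \\ A \end{matrix}}{B}\ {\to^-}
$$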


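For the universal quantifier $\forall$ there is an introduction rule ${\forall^+}x$ and an
elimination rule $\forall^-$, whose right premise is the term $r$ to be substituted.
The rule ${\forall^+}x$ is subject to the following (Eigen-) variable condition: The
derivation $M$ of the premise $A$ should not contain any open assumption with
$x$ as a free variable.


$$
\dfrac{\begin{matrix} \mid M \\ A \end{matrix}}{\forall x A}\ {\forall^+}x
\qquad\qquad
\dfrac{\begin{matrix} \mid M \\ \forall x A \end{matrix} \quad r}{A[x := r]}\ {\forall^-}
$$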


We now give derivations for the example formulas (1) — (3). Since in 
many cases the rule used is determined by the formula on the node, we 
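suppress in such cases the name of the rule.


$$
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{
        \dfrac{u\colon A \to B \to C \quad w\colon A}{B \to C}
        \quad
        \dfrac{v\colon A \to B \quad w\colon A}{B}
      }{C}
    }{A \to C}\ {\to^+}w
  }{(A \to B) \to A \to C}\ {\to^+}v
}{(A \to B \to C) \to (A \to B) \to A \to C}\ {\to^+}u
\tag{1}
$$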


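$$
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{
        \dfrac{u\colon \forall x. A \to B \quad x}{A \to B}
        \quad v\colon A
      }{B}
    }{\forall x B}\ {\forall^+}x
  }{A \to \forall x B}\ {\to^+}v
}{(\forall x. A \to B) \to A \to \forall x B}\ {\to^+}u
\tag{2}
$$


Note here that the variable condition is satisfied: $x$ is not free in $A$ (and
also not free in $\forall x. A \to B$).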


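$$
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{
        \dfrac{u\colon A \to \forall x B \quad v\colon A}{\forall x B}
        \quad x
      }{B}
    }{A \to B}\ {\to^+}v
  }{\forall x. A \to B}\ {\forall^+}x
}{(A \to \forall x B) \to \forall x. A \to B}\ {\to^+}u
\tag{3}
$$

Here too the variable condition is satisfied: $x$ is not free in $A$.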
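
As a preview of the correspondence between derivations and lambda terms, the three example derivations can be written as proof terms in a proof assistant. The following Lean 4 sketch is an illustration only: the type `α` of individuals and the modelling of $B$ as a predicate `B : α → Prop` are assumptions of the sketch, and the side condition $x \notin \mathrm{FV}(A)$ is reflected by `A` not depending on `x`.

```lean
-- Example (1): assumptions u, v, w are bound by three →-introductions.
example (A B C : Prop) : (A → B → C) → (A → B) → A → C :=
  fun u v w => u w (v w)

-- Example (2): ∀-introduction on x happens after assuming v : A.
example (α : Type) (A : Prop) (B : α → Prop) :
    (∀ x, A → B x) → A → ∀ x, B x :=
  fun u v x => u x v

-- Example (3): ∀-introduction on x happens before assuming v : A.
example (α : Type) (A : Prop) (B : α → Prop) :
    (A → ∀ x, B x) → ∀ x, A → B x :=
  fun u x v => u v x
```

Each `fun` corresponds to one introduction rule in the trees above, and each application to one elimination rule.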
2.3. Axiom Schemes for Disjunction, Conjunction, Existence
and Falsity. We follow the usual practice of considering all free variables
in an axiom as universally quantified outside.

Disjunction. The introduction axioms are

$$\begin{aligned}
\lor^+_0 &\colon A \to A \lor B \\
\lor^+_1 &\colon B \to A \lor B
\end{aligned}$$

and the elimination axiom is

$$\lor^- \colon (A \to C) \to (B \to C) \to A \lor B \to C.$$

Conjunction. The introduction axiom is

$$\land^+ \colon A \to B \to A \land B$$

and the elimination axiom is

$$\land^- \colon (A \to B \to C) \to A \land B \to C.$$

Existential Quantifier. The introduction axiom is

$$\exists^+ \colon A \to \exists x A$$

and the elimination axiom is

$$\exists^- \colon (\forall x. A \to B) \to \exists x A \to B \quad (x \text{ not free in } B).$$




Falsity. This example is somewhat extreme, since there is no introduction
axiom; the elimination axiom is

$$\bot^- \colon \bot \to A.$$

In the literature this axiom is frequently called “ex-falso-quodlibet”, written
Efq. It clearly is derivable from its instances $\bot \to R\vec{x}$, for every relation
symbol $R$.

Equality. The introduction axiom is

$$\mathrm{Eq}^+ \colon \mathrm{Eq}(x, x)$$

and the elimination axiom is

$$\mathrm{Eq}^- \colon \forall x R(x, x) \to \mathrm{Eq}(x, y) \to R(x, y).$$


It is an easy exercise to show that the usual equality axioms can be derived. 

All these axioms can be seen as special cases of a general scheme, that 
of an inductively defined predicate, which is defined by some introduction 
rules and one elimination rule. We will study this kind of definition in full 
generality in Chapter 6. $\mathrm{Eq}(x, y)$ is a binary such predicate, $\bot$ is a nullary
one, and $A \lor B$ another nullary one which however depends on the two
parameter predicates $A$ and $B$.

The desire to follow this general pattern is also the reason that we have
chosen our rather strange $\land^-$-axiom, instead of the more obvious $A \land B \to A$
and $A \land B \to B$ (which clearly are equivalent).


2.4. Minimal, Intuitionistic and Classical Logic. We write $\vdash A$
and call $A$ derivable (in minimal logic), if there is a derivation of $A$ without
free assumptions, from the axioms of 2.3 using the rules from 2.2, but without
using the ex-falso-quodlibet axiom, i.e., the elimination axiom $\bot^-$ for $\bot$. A
formula $B$ is called derivable from assumptions $A_1, \dots, A_n$, if there is a
derivation (without $\bot^-$) of $B$ with free assumptions among $A_1, \dots, A_n$. Let
$\Gamma$ be a (finite or infinite) set of formulas. We write $\Gamma \vdash B$ if the formula $B$
is derivable from finitely many assumptions $A_1, \dots, A_n \in \Gamma$.

Similarly we write $\vdash_i A$ and $\Gamma \vdash_i B$ if use of the ex-falso-quodlibet axiom
is allowed; we then speak of derivability in intuitionistic logic.

For classical logic there is no need to use the full set of logical connectives:
classical disjunction as well as the classical existential quantifier are defined,
by $A \lor^{cl} B := \neg A \to \neg B \to \bot$ and $\exists^{cl} x\, A := \neg \forall x \neg A$. Moreover, when dealing
with derivability we can even get rid of conjunction; this can be seen from
the following lemma:

LEMMA (Elimination of $\land$). For each formula $A$ built with the connectives
$\to, \land, \forall$ there are formulas $A_1, \dots, A_n$ without $\land$ such that

$$\vdash A \leftrightarrow \bigwedge_{i=1}^{n} A_i.$$


PROOF. Induction on $A$. Case $R\vec{t}$. Take $n = 1$ and $A_1 := R\vec{t}$. Case
$A \land B$. By induction hypothesis, we have $A_1, \dots, A_n$ and $B_1, \dots, B_m$. Take
$A_1, \dots, A_n, B_1, \dots, B_m$. Case $A \to B$. By induction hypothesis, we have
$A_1, \dots, A_n$ and $B_1, \dots, B_m$. For the sake of notational simplicity assume
$n = 2$ and $m = 3$. Then

$$\vdash (A_1 \land A_2 \to B_1 \land B_2 \land B_3) \leftrightarrow (A_1 \to A_2 \to B_1) \land (A_1 \to A_2 \to B_2) \land (A_1 \to A_2 \to B_3).$$

Case $\forall x A$. By induction hypothesis for $A$, we have $A_1, \dots, A_n$. Take
$\forall x A_1, \dots, \forall x A_n$, for

$$\vdash \forall x \bigwedge_{i=1}^{n} A_i \leftrightarrow \bigwedge_{i=1}^{n} \forall x A_i.$$

This concludes the proof.


For the rest of this section, let us restrict to the language based on $\to$,
$\forall$, $\bot$ and $\land$. We obtain classical logic by adding, for every relation symbol
$R$ distinct from $\bot$, the principle of indirect proof expressed as the so-called
“stability axiom” ($\mathrm{Stab}_R$):

$$\neg\neg R\vec{x} \to R\vec{x}.$$

Let

$$\mathrm{Stab} := \{\, \forall \vec{x}.\, \neg\neg R\vec{x} \to R\vec{x} \mid R \text{ relation symbol distinct from } \bot \,\}.$$

We call the formula $A$ classically derivable and write $\vdash_c A$ if there is a
derivation of $A$ from stability assumptions $\mathrm{Stab}_R$. Similarly we define classical
derivability from $\Gamma$ and write $\Gamma \vdash_c A$, i.e.

$$\Gamma \vdash_c A \;:\Longleftrightarrow\; \Gamma \cup \mathrm{Stab} \vdash A.$$


THEOREM (Stability, or Principle of Indirect Proof). For every formula
$A$ (of our language based on $\to$, $\forall$, $\bot$ and $\land$),

$$\vdash_c \neg\neg A \to A.$$


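PROOF. Induction on $A$. For simplicity, in the derivation to be constructed
we leave out applications of $\to^+$ at the end. Case $R\vec{t}$ with $R$
distinct from $\bot$. Use $\mathrm{Stab}_R$. Case $\bot$. Observe that $\neg\neg\bot \to \bot = ((\bot \to
\bot) \to \bot) \to \bot$. The desired derivation is

$$
\dfrac{v\colon (\bot \to \bot) \to \bot \quad \dfrac{u\colon \bot}{\bot \to \bot}\ {\to^+}u}{\bot}
$$

Case $A \to B$. Use $\vdash (\neg\neg B \to B) \to \neg\neg(A \to B) \to A \to B$; a derivation is

$$
\dfrac{
  u\colon \neg\neg B \to B
  \quad
  \dfrac{
    \dfrac{
      v\colon \neg\neg(A \to B)
      \quad
      \dfrac{
        \dfrac{u_1\colon \neg B \quad \dfrac{u_2\colon A \to B \quad w\colon A}{B}}{\bot}
      }{\neg(A \to B)}\ {\to^+}u_2
    }{\bot}
  }{\neg\neg B}\ {\to^+}u_1
}{B}
$$

Case $\forall x A$. Clearly it suffices to show $\vdash (\neg\neg A \to A) \to \neg\neg\forall x A \to A$; a
derivation is

$$
\dfrac{
  u\colon \neg\neg A \to A
  \quad
  \dfrac{
    \dfrac{
      v\colon \neg\neg\forall x A
      \quad
      \dfrac{
        \dfrac{u_1\colon \neg A \quad \dfrac{u_2\colon \forall x A \quad x}{A}}{\bot}
      }{\neg\forall x A}\ {\to^+}u_2
    }{\bot}
  }{\neg\neg A}\ {\to^+}u_1
}{A}
$$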




The case AA B is left to the reader. 


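Notice that clearly $\vdash_c \bot \to A$, for stability is stronger:

$$
\dfrac{
  \dfrac{
    \begin{matrix} \mid M_{\mathrm{Stab}} \\ \neg\neg A \to A \end{matrix}
    \quad
    \dfrac{u\colon \bot}{\neg\neg A}\ {\to^+}v^{\neg A}
  }{A}
}{\bot \to A}\ {\to^+}u
$$

where $M_{\mathrm{Stab}}$ is the (classical) derivation of stability.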

Notice also that even for the $\to, \bot$-fragment the inclusions of minimal
logic in intuitionistic logic, and of the latter in classical logic, are proper.
Examples are

$$\begin{aligned}
&\nvdash \bot \to P, &&\text{but } \vdash_i \bot \to P, \\
&\nvdash_i ((P \to Q) \to P) \to P, &&\text{but } \vdash_c ((P \to Q) \to P) \to P.
\end{aligned}$$

Non-derivability can be proved by means of countermodels, using a semantic
characterization of derivability; this will be done in Chapter 2. $\vdash_i \bot \to P$
is obvious, and the Peirce formula $((P \to Q) \to P) \to P$ can be derived in
minimal logic from $\bot \to Q$ and $\neg\neg P \to P$, hence is derivable in classical
logic.


2.5. Negative Translation. We embed classical logic into minimal
logic, via the so-called negative (or Gödel-Gentzen) translation.

A formula $A$ is called negative, if every atomic formula of $A$ distinct
from $\bot$ occurs negated, and $A$ does not contain $\lor$, $\exists$.


LEMMA. For negative $A$, $\vdash \neg\neg A \to A$.


PROOF. This follows from the proof of the stability theorem, using $\vdash \neg\neg\neg R\vec{t} \to \neg R\vec{t}$.


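Since $\lor$, $\exists$ do not occur in formulas of classical logic, in the rest of this
section we consider the language based on $\to$, $\forall$, $\bot$ and $\land$ only.


DEFINITION (Negative translation ${}^g$ of Gödel-Gentzen).

$$\begin{aligned}
(R\vec{t}\,)^g &:= \neg\neg R\vec{t} \quad (R \text{ distinct from } \bot), \\
\bot^g &:= \bot, \\
(A \land B)^g &:= A^g \land B^g, \\
(A \to B)^g &:= A^g \to B^g, \\
(\forall x A)^g &:= \forall x A^g.
\end{aligned}$$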
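
The translation is a straightforward structural recursion. The following sketch assumes a tuple representation of formulas of the restricted language (`("bot",)`, `("atom", name)`, `("imp", A, B)`, `("and", A, B)`, `("all", x, A)`); the names `BOT`, `neg` and `g` are chosen for the sketch.

```python
BOT = ("bot",)                    # falsum


def neg(a):
    """Negation as defined in the text: not A := A -> falsum."""
    return ("imp", a, BOT)


def g(a):
    """Negative (Goedel-Gentzen) translation for the ->, forall, falsum, and fragment."""
    tag = a[0]
    if tag == "bot":
        return BOT
    if tag == "atom":             # atoms distinct from falsum get doubly negated
        return neg(neg(a))
    if tag in ("imp", "and"):
        return (tag, g(a[1]), g(a[2]))
    if tag == "all":
        return ("all", a[1], g(a[2]))
    raise ValueError(f"formula outside the fragment: {tag}")


P = ("atom", "P")
stability_P = ("imp", neg(neg(P)), P)   # not not P -> P
```

Applied to the stability formula for an atom, `g(stability_P)` yields the fourfold negation implying the double negation, i.e. $\neg\neg\neg\neg P \to \neg\neg P$, which is derivable in minimal logic.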


THEOREM. For all formulas $A$,

(a) $\vdash_c A \leftrightarrow A^g$,
(b) $\Gamma \vdash_c A$ iff $\Gamma^g \vdash A^g$, where $\Gamma^g := \{ B^g \mid B \in \Gamma \}$.


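PROOF. (a). The claim follows from the fact that $\vdash_c$ is compatible with
equivalence. (b). $\Leftarrow$. Obvious. $\Rightarrow$. By induction on the classical derivation.
For a stability assumption $\neg\neg R\vec{t} \to R\vec{t}$ we have $(\neg\neg R\vec{t} \to R\vec{t}\,)^g =
\neg\neg\neg\neg R\vec{t} \to \neg\neg R\vec{t}$, and this is easily derivable. Case $\to^+$. Assume

$$
\dfrac{\begin{matrix} [u\colon A] \\ \mathcal{D} \\ B \end{matrix}}{A \to B}\ {\to^+}u
$$

Then we have by induction hypothesis

$$
\begin{matrix} u\colon A^g \\ \mathcal{D}^g \\ B^g \end{matrix}
\qquad\text{hence}\qquad
\dfrac{\begin{matrix} [u\colon A^g] \\ \mathcal{D}^g \\ B^g \end{matrix}}{A^g \to B^g}\ {\to^+}u
$$

Case $\to^-$. Assume

$$
\dfrac{\begin{matrix} \mathcal{D}_0 \\ A \to B \end{matrix} \quad \begin{matrix} \mathcal{D}_1 \\ A \end{matrix}}{B}
$$

Then we have by induction hypothesis

$$
\begin{matrix} \mathcal{D}_0^g \\ A^g \to B^g \end{matrix}
\quad
\begin{matrix} \mathcal{D}_1^g \\ A^g \end{matrix}
\qquad\text{hence}\qquad
\dfrac{\begin{matrix} \mathcal{D}_0^g \\ A^g \to B^g \end{matrix} \quad \begin{matrix} \mathcal{D}_1^g \\ A^g \end{matrix}}{B^g}
$$

The other cases are treated similarly.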


COROLLARY (Embedding of classical logic). For negative $A$,

$$\vdash_c A \implies \vdash A.$$


PROOF. By the theorem we have $\vdash_c A$ iff $\vdash A^g$. Since $A$ is negative,
every atom distinct from $\bot$ in $A$ must occur negated, and hence in $A^g$ it
must appear in threefold negated form (as $\neg\neg\neg R\vec{t}$). The claim follows from
$\vdash \neg\neg\neg R\vec{t} \to \neg R\vec{t}$.


Since every formula is classically equivalent to a negative formula, we
have achieved an embedding of classical logic into minimal logic.

Note that $\nvdash \neg\neg P \to P$ (as we shall show in Chapter 2). The corollary
therefore does not hold for all formulas $A$.


3. Normalization 


We show in this section that every derivation can be transformed by 
appropriate conversion steps into a normal form. A derivation in normal 
form does not make “detours”, or more precisely, it cannot occur that an 
elimination rule immediately follows an introduction rule. Derivations in 
normal form have many pleasant properties. 

Uniqueness of normal form will be shown by means of an application of 
Newman’s lemma; we will also introduce and discuss the related notions of 
confluence, weak confluence and the Church-Rosser property. 

We finally show that the requirement to give a normal derivation of 
a derivable formula can sometimes be unrealistic. Following Statman [25] 
and Orevkov [19] we give examples of formulas $C_k$ which are easily derivable
with non-normal derivations (of size linear in $k$), but which require a
non-elementary (in $k$) size in any normal derivation.

This can be seen as a theoretical explanation of the essential role played 
by lemmas in mathematical arguments. 
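3.1. Conversion. A conversion eliminates a detour in a derivation,
i.e., an elimination immediately following an introduction. We consider the
following conversions:

$\to$-conversion.

$$
\dfrac{
  \dfrac{\begin{matrix} [u\colon A] \\ \mid M \\ B \end{matrix}}{A \to B}\ {\to^+}u
  \quad
  \begin{matrix} \mid N \\ A \end{matrix}
}{B}\ {\to^-}
\qquad\mapsto\qquad
\begin{matrix} \mid N \\ A \\ \mid M \\ B \end{matrix}
$$

$\forall$-conversion.

$$
\dfrac{
  \dfrac{\begin{matrix} \mid M \\ A \end{matrix}}{\forall x A}\ {\forall^+}x
  \quad r
}{A[x := r]}\ {\forall^-}
\qquad\mapsto\qquad
\begin{matrix} \mid M' \\ A[x := r] \end{matrix}
$$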


3.2. Derivations as Terms. It will be convenient to represent deriva- 
tions as terms, where the derived formula is viewed as the type of the term. 
This representation is known under the name Curry-Howard correspondence. 

We give an inductive definition of derivation terms in the table below, 
where for clarity we have written the corresponding derivations to the left. 
For the universal quantifier $\forall$ there is an introduction rule ${\forall^+}x$ and an
elimination rule $\forall^-$, whose right premise is the term $r$ to be substituted.
The rule ${\forall^+}x$ is subject to the following (Eigen-) variable condition: The
derivation term $M$ of the premise $A$ should not contain any open assumption
with $x$ as a free variable.


3.3. Reduction, Normal Form. Although every derivation term car- 
ries a formula as its type, we shall usually leave these formulas implicit and 
write derivation terms without them. 

Notice that every derivation term can be written uniquely in one of the
forms

$$u\vec{M} \mid \lambda v M \mid (\lambda v M) N \vec{L},$$

where $u$ is an assumption variable or assumption constant, $v$ is an assumption
variable or object variable, and $M$, $N$, $L$ are derivation terms or object
terms.

Here the final form is not normal: $(\lambda v M) N \vec{L}$ is called $\beta$-redex (for “reducible
expression”). The conversion rule is

$$(\lambda v M) N \mapsto_\beta M[v := N].$$

Notice that in a substitution $M[v := N]$ with $M$ a derivation term and
$v$ an object variable, one also needs to substitute in the formulas of $M$.
The closure of the conversion relation $\mapsto_\beta$ is defined by
• If $M \mapsto_\beta M'$, then $M \to M'$.
• If $M \to M'$, then also $MN \to M'N$, $NM \to NM'$, $\lambda v M \to \lambda v M'$
(inner reductions).
So $M \to N$ means that $M$ reduces in one step to $N$, i.e., $N$ is obtained
from $M$ by replacement of (an occurrence of) a redex $M'$ of $M$ by a conversum
$M''$ of $M'$, i.e. by a single conversion. The relation $\to^+$ (“properly
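derivation                                                                      term

$u\colon A$                                                                     $u^A$

$\dfrac{\begin{matrix} [u\colon A] \\ \mid M \\ B \end{matrix}}{A \to B}\ {\to^+}u$                                    $(\lambda u^A M^B)^{A \to B}$

$\dfrac{\begin{matrix} \mid M \\ A \to B \end{matrix} \quad \begin{matrix} \mid N \\ A \end{matrix}}{B}\ {\to^-}$                       $(M^{A \to B} N^A)^B$

$\dfrac{\begin{matrix} \mid M \\ A \end{matrix}}{\forall x A}\ {\forall^+}x$ (with var.cond.)                    $(\lambda x M^A)^{\forall x A}$ (with var.cond.)

$\dfrac{\begin{matrix} \mid M \\ \forall x A \end{matrix} \quad r}{A[x := r]}\ {\forall^-}$                               $(M^{\forall x A} r)^{A[x := r]}$


TABLE 1. Derivation terms for $\to$ and $\forall$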


reduces to”) is the transitive closure of $\to$ and $\to^*$ (“reduces to”) is the reflexive
and transitive closure of $\to$. The relation $\to^*$ is said to be the notion
of reduction generated by $\mapsto$. $\leftarrow$, $\leftarrow^+$, $\leftarrow^*$ are the relations converse to
$\to$, $\to^+$, $\to^*$, respectively.

A term M is in normal form, or M is normal, if M does not contain a 
redex. $M$ has a normal form if there is a normal $N$ such that $M \to^* N$.

A reduction sequence is a (finite or infinite) sequence $M_0 \to M_1 \to
M_2 \dots$ such that $M_i \to M_{i+1}$, for all $i$.

Finite reduction sequences are partially ordered under the initial part 
relation; the collection of finite reduction sequences starting from a term 
M forms a tree, the reduction tree of M. The branches of this tree may 
be identified with the collection of all infinite and all terminating finite 
reduction sequences. 

A term is strongly normalizing if its reduction tree is finite. 


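EXAMPLE.

$$\begin{aligned}
(\lambda x \lambda y \lambda z.\, x z (y z)) (\lambda u \lambda v\, u) (\lambda u' \lambda v'\, u') &\to \\
(\lambda y \lambda z.\, (\lambda u \lambda v\, u) z (y z)) (\lambda u' \lambda v'\, u') &\to \\
(\lambda y \lambda z.\, (\lambda v\, z)(y z)) (\lambda u' \lambda v'\, u') &\to \\
(\lambda y \lambda z\, z) (\lambda u' \lambda v'\, u') &\to \lambda z\, z.
\end{aligned}$$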
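
The example can be replayed mechanically. The sketch below works on untyped terms (formulas are left implicit, as in the text) represented as tuples `("var", v)`, `("lam", v, M)`, `("app", M, N)`; this representation, the leftmost-outermost strategy in `step`, and the `fuel` bound in `normalize` are assumptions of the sketch. The strategy differs from the reduction order shown above but reaches the same normal form.

```python
import itertools


def free_vars(m):
    if m[0] == "var":
        return {m[1]}
    if m[0] == "lam":
        return free_vars(m[2]) - {m[1]}
    return free_vars(m[1]) | free_vars(m[2])


def subst(m, v, n):
    """M[v := N], renaming bound variables where a clash would occur."""
    if m[0] == "var":
        return n if m[1] == v else m
    if m[0] == "app":
        return ("app", subst(m[1], v, n), subst(m[2], v, n))
    w, body = m[1], m[2]
    if w == v:
        return m
    if w in free_vars(n) and v in free_vars(body):
        avoid = free_vars(n) | free_vars(body) | {v}
        new = next(f"{w}{i}" for i in itertools.count() if f"{w}{i}" not in avoid)
        body, w = subst(body, w, ("var", new)), new
    return ("lam", w, subst(body, v, n))


def step(m):
    """One leftmost-outermost reduction step, or None if m is normal."""
    if m[0] == "app":
        f, a = m[1], m[2]
        if f[0] == "lam":                     # beta-redex (lam v. M) N
            return subst(f[2], f[1], a)
        r = step(f)
        if r is not None:
            return ("app", r, a)
        r = step(a)
        return None if r is None else ("app", f, r)
    if m[0] == "lam":
        r = step(m[2])
        return None if r is None else ("lam", m[1], r)
    return None


def normalize(m, fuel=1000):
    """Iterate step until a normal form is reached (at most fuel steps)."""
    for _ in range(fuel):
        r = step(m)
        if r is None:
            return m
        m = r
    raise RuntimeError("no normal form found within the step bound")


def lam(v, body): return ("lam", v, body)
def app(f, a): return ("app", f, a)
def var(v): return ("var", v)


S = lam("x", lam("y", lam("z", app(app(var("x"), var("z")), app(var("y"), var("z"))))))
K = lam("u", lam("v", var("u")))
K2 = lam("u'", lam("v'", var("u'")))
example = app(app(S, K), K2)
```

Here `normalize(example)` returns the term representing $\lambda z\, z$, in agreement with the reduction sequence of the example.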
LEMMA (Substitutivity of $\to$). (a) If $M \to M'$, then $MN \to M'N$.
(b) If $N \to N'$, then $MN \to MN'$.
(c) If $M \to M'$, then $M[v := N] \to M'[v := N]$.
(d) If $N \to N'$, then $M[v := N] \to^* M[v := N']$.

PROOF. (a) and (c) are proved by induction on $M \to M'$; (b) and (d)
by induction on $M$. Notice that the reason for $\to^*$ in (d) is the fact that $v$
may have many occurrences in $M$.


3.4. Strong Normalization. We show that every term is strongly nor- 
malizing. 

To this end, define by recursion on k a relation sn(M, k) between terms 
M and natural numbers k with the intention that k is an upper bound on 
the number of reduction steps up to normal form. 


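$$\begin{aligned}
\mathrm{sn}(M, 0) &:\Longleftrightarrow M \text{ is in normal form}, \\
\mathrm{sn}(M, k+1) &:\Longleftrightarrow \mathrm{sn}(M', k) \text{ for all } M' \text{ such that } M \to M'.
\end{aligned}$$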
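
The relation sn can be evaluated directly on small terms by enumerating all one-step reducts. The sketch below is an illustration on untyped terms in an assumed tuple representation (`("var", v)`, `("lam", v, M)`, `("app", M, N)`); `reducts`, `sn` and `least_bound` are names chosen here, and `least_bound` terminates only for terms whose reduction tree is finite.

```python
import itertools


def free_vars(m):
    if m[0] == "var":
        return {m[1]}
    if m[0] == "lam":
        return free_vars(m[2]) - {m[1]}
    return free_vars(m[1]) | free_vars(m[2])


def subst(m, v, n):
    """M[v := N], renaming bound variables where a clash would occur."""
    if m[0] == "var":
        return n if m[1] == v else m
    if m[0] == "app":
        return ("app", subst(m[1], v, n), subst(m[2], v, n))
    w, body = m[1], m[2]
    if w == v:
        return m
    if w in free_vars(n) and v in free_vars(body):
        avoid = free_vars(n) | free_vars(body) | {v}
        new = next(f"{w}{i}" for i in itertools.count() if f"{w}{i}" not in avoid)
        body, w = subst(body, w, ("var", new)), new
    return ("lam", w, subst(body, v, n))


def reducts(m):
    """All M' with M -> M' (head conversion and inner reductions)."""
    out = []
    if m[0] == "app":
        f, a = m[1], m[2]
        if f[0] == "lam":
            out.append(subst(f[2], f[1], a))
        out += [("app", r, a) for r in reducts(f)]
        out += [("app", f, r) for r in reducts(a)]
    elif m[0] == "lam":
        out += [("lam", m[1], r) for r in reducts(m[2])]
    return out


def sn(m, k):
    """The relation sn(M, k), by recursion on k as in the definition."""
    rs = reducts(m)
    if k == 0:
        return not rs                      # M is in normal form
    return all(sn(r, k - 1) for r in rs)


def least_bound(m):
    """Least k with sn(M, k); only terminates for strongly normalizing M."""
    rs = reducts(m)
    return 0 if not rs else 1 + max(least_bound(r) for r in rs)


def lam(v, body): return ("lam", v, body)
def app(f, a): return ("app", f, a)
def var(v): return ("var", v)


S = lam("x", lam("y", lam("z", app(app(var("x"), var("z")), app(var("y"), var("z"))))))
K = lam("u", lam("v", var("u")))
K2 = lam("u'", lam("v'", var("u'")))
example = app(app(S, K), K2)
```

Since `sn(M, k)` quantifies over all reducts, the least such `k` is the length of a longest reduction sequence from `M`, not of the particular sequence a given strategy happens to follow.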


Clearly a term is strongly normalizable if there is a k such that sn(M, k). 
We first prove some closure properties of the relation sn. 


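LEMMA (Properties of sn). (a) If $\mathrm{sn}(M, k)$, then $\mathrm{sn}(M, k + 1)$.

(b) If $\mathrm{sn}(MN, k)$, then $\mathrm{sn}(M, k)$.

(c) If $\mathrm{sn}(M_i, k_i)$ for $i = 1 \dots n$, then $\mathrm{sn}(uM_1 \dots M_n, k_1 + \dots + k_n)$.

(d) If $\mathrm{sn}(M, k)$, then $\mathrm{sn}(\lambda v M, k)$.

(e) If $\mathrm{sn}(M[v := N]\vec{L}, k)$ and $\mathrm{sn}(N, l)$, then $\mathrm{sn}((\lambda v M)N\vec{L}, k + l + 1)$.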


PROOF. (a). Induction on $k$. Assume $\mathrm{sn}(M, k)$. We show $\mathrm{sn}(M, k + 1)$.
So let $M'$ with $M \to M'$ be given; because of $\mathrm{sn}(M, k)$ we must have $k > 0$.
We have to show $\mathrm{sn}(M', k)$. Because of $\mathrm{sn}(M, k)$ we have $\mathrm{sn}(M', k - 1)$, hence
by induction hypothesis $\mathrm{sn}(M', k)$.

(b). Induction on $k$. Assume $\mathrm{sn}(MN, k)$. We show $\mathrm{sn}(M, k)$. In case $k =
0$ the term $MN$ is normal, hence also $M$ is normal and therefore $\mathrm{sn}(M, 0)$.
So let $k > 0$ and $M \to M'$; we have to show $\mathrm{sn}(M', k - 1)$. From $M \to
M'$ we have $MN \to M'N$. Because of $\mathrm{sn}(MN, k)$ we have by definition
$\mathrm{sn}(M'N, k - 1)$, hence $\mathrm{sn}(M', k - 1)$ by induction hypothesis.

(c). Assume $\mathrm{sn}(M_i, k_i)$ for $i = 1 \dots n$. We show $\mathrm{sn}(uM_1 \dots M_n, k)$ with
$k := k_1 + \dots + k_n$. Again we employ induction on $k$. In case $k = 0$ all
$M_i$ are normal, hence also $uM_1 \dots M_n$. So let $k > 0$ and $uM_1 \dots M_n \to
M'$. Then $M' = uM_1 \dots M_i' \dots M_n$ with $M_i \to M_i'$. We have to show
$\mathrm{sn}(uM_1 \dots M_i' \dots M_n, k - 1)$. Because of $M_i \to M_i'$ and $\mathrm{sn}(M_i, k_i)$ we have
$k_i > 0$ and $\mathrm{sn}(M_i', k_i - 1)$, hence $\mathrm{sn}(uM_1 \dots M_i' \dots M_n, k - 1)$ by induction
hypothesis.

(d). Assume $\mathrm{sn}(M, k)$. We have to show $\mathrm{sn}(\lambda v M, k)$. Use induction on
$k$. In case $k = 0$ $M$ is normal, hence $\lambda v M$ is normal, hence $\mathrm{sn}(\lambda v M, 0)$. So
let $k > 0$ and $\lambda v M \to L$. Then $L$ has the form $\lambda v M'$ with $M \to M'$. So
$\mathrm{sn}(M', k - 1)$ by definition, hence $\mathrm{sn}(\lambda v M', k - 1)$ by induction hypothesis.

(e). Assume $\mathrm{sn}(M[v := N]\vec{L}, k)$ and $\mathrm{sn}(N, l)$. We need to show that
$\mathrm{sn}((\lambda v M)N\vec{L}, k + l + 1)$. We use induction on $k + l$. In case $k + l = 0$ the
term $N$ and $M[v := N]\vec{L}$ are normal, hence also $M$ and all $L_i$. Hence there
is exactly one term $K$ such that $(\lambda v M)N\vec{L} \to K$, namely $M[v := N]\vec{L}$, and
this $K$ is normal. So let $k + l > 0$ and $(\lambda v M)N\vec{L} \to K$. We have to show
$\mathrm{sn}(K, k + l)$.

Case $K = M[v := N]\vec{L}$, i.e. we have a head conversion. From $\mathrm{sn}(M[v :=
N]\vec{L}, k)$ we obtain $\mathrm{sn}(M[v := N]\vec{L}, k + l)$ by (a).

Case $K = (\lambda v M')N\vec{L}$ with $M \to M'$. Then we have $M[v := N]\vec{L} \to
M'[v := N]\vec{L}$. Now $\mathrm{sn}(M[v := N]\vec{L}, k)$ implies $k > 0$ and $\mathrm{sn}(M'[v :=
N]\vec{L}, k - 1)$. The induction hypothesis yields $\mathrm{sn}((\lambda v M')N\vec{L}, k - 1 + l + 1)$.

Case $K = (\lambda v M)N'\vec{L}$ with $N \to N'$. Now $\mathrm{sn}(N, l)$ implies $l > 0$ and
$\mathrm{sn}(N', l - 1)$. The induction hypothesis yields $\mathrm{sn}((\lambda v M)N'\vec{L}, k + l - 1 + 1)$,
since $\mathrm{sn}(M[v := N']\vec{L}, k)$ by (a).

Case $K = (\lambda v M)N\vec{L}'$ with $L_i \to L_i'$ for some $i$ and $L_j = L_j'$ for $j \ne i$.
Then we have $M[v := N]\vec{L} \to M[v := N]\vec{L}'$. Now $\mathrm{sn}(M[v := N]\vec{L}, k)$
implies $k > 0$ and $\mathrm{sn}(M[v := N]\vec{L}', k - 1)$. The induction hypothesis yields
$\mathrm{sn}((\lambda v M)N\vec{L}', k - 1 + l + 1)$.


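The essential idea of the strong normalization proof is to view the last
three closure properties of sn from the preceding lemma, without the
information on the bounds, as an inductive definition of a new set SN:

$$
\frac{\vec{M} \in \mathrm{SN}}{u\vec{M} \in \mathrm{SN}}\ (\mathrm{Var})
\qquad
\frac{M \in \mathrm{SN}}{\lambda v M \in \mathrm{SN}}\ (\lambda)
\qquad
\frac{M[v := N]\vec{L} \in \mathrm{SN} \quad N \in \mathrm{SN}}{(\lambda v M)N\vec{L} \in \mathrm{SN}}\ (\beta)
$$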


COROLLARY. For every term $M \in \mathrm{SN}$ there is a $k \in \mathbb{N}$ such that
$\mathrm{sn}(M, k)$. Hence every term $M \in \mathrm{SN}$ is strongly normalizable.


PROOF. By induction on $M \in \mathrm{SN}$, using the previous lemma.
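For small terms the bound $k$ in $\mathrm{sn}(M, k)$ can be computed directly. The following is a minimal sketch only: it assumes untyped lambda terms in de Bruijn notation (the derivation terms of the text are typed, and types are ignored here), and it reads $\mathrm{sn}(M, k)$ as "every reduction sequence from $M$ has length at most $k$", as suggested by the use of sn in the lemma above. The names `Var`, `Lam`, `App`, `reducts` and `bound` are illustrative and do not come from the text.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Var:
    n: int            # de Bruijn index


@dataclass(frozen=True)
class Lam:
    body: object


@dataclass(frozen=True)
class App:
    fun: object
    arg: object


def shift(t, d, cutoff=0):
    """Add d to every variable index >= cutoff (the free variables)."""
    if isinstance(t, Var):
        return Var(t.n + d) if t.n >= cutoff else t
    if isinstance(t, Lam):
        return Lam(shift(t.body, d, cutoff + 1))
    return App(shift(t.fun, d, cutoff), shift(t.arg, d, cutoff))


def subst(t, j, s):
    """Replace variable j in t by s."""
    if isinstance(t, Var):
        return s if t.n == j else t
    if isinstance(t, Lam):
        return Lam(subst(t.body, j + 1, shift(s, 1)))
    return App(subst(t.fun, j, s), subst(t.arg, j, s))


def reducts(t):
    """All terms reachable from t by one beta step (head or inner)."""
    out = []
    if isinstance(t, App):
        if isinstance(t.fun, Lam):  # head conversion (lambda v M) N -> M[v := N]
            out.append(shift(subst(t.fun.body, 0, shift(t.arg, 1)), -1))
        out += [App(f, t.arg) for f in reducts(t.fun)]
        out += [App(t.fun, a) for a in reducts(t.arg)]
    elif isinstance(t, Lam):
        out += [Lam(b) for b in reducts(t.body)]
    return out


@lru_cache(maxsize=None)
def bound(t):
    """Least k with sn(t, k); does not terminate if t is not strongly normalizing."""
    return max((1 + bound(r) for r in reducts(t)), default=0)


I = Lam(Var(0))                                        # lambda x. x
example = App(Lam(App(Var(0), Var(0))), App(I, Var(0)))  # (lambda x. x x)(I y)
```

For `example` the longest reduction sequence first copies the redex `I y` and then contracts both copies, so `bound(example)` is 3, while contracting `I y` first gives a sequence of length 2.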


In what follows we shall show that every term is in SN and hence is 
strongly normalizable. Given the definition of SN we only have to show 
that SN is closed under application. In order to prove this we must prove 
simultaneously the closure of SN under substitution. 


THEOREM (Properties of SN). For all formulas $A$, derivation terms $M \in \mathrm{SN}$
and $N^A \in \mathrm{SN}$ the following holds.
(a) $M[v := N] \in \mathrm{SN}$.
(a') $M[x := r] \in \mathrm{SN}$.
(b) Suppose $M$ derives $A \to B$. Then $MN \in \mathrm{SN}$.
(b') Suppose $M$ derives $\forall x A$. Then $Mr \in \mathrm{SN}$.


PROOF. By course-of-values induction on $\mathrm{dp}(A)$, with a side induction
on $M \in \mathrm{SN}$. Let $N^A \in \mathrm{SN}$. We distinguish cases on the form of $M$.

Case $u\vec{M}$ by (Var) from $\vec{M} \in \mathrm{SN}$. (a). The SIH(a) (SIH means side
induction hypothesis) yields $M_i[v := N] \in \mathrm{SN}$ for all $M_i$ from $\vec{M}$. In case
$u \neq v$ we immediately have $(u\vec{M})[v := N] \in \mathrm{SN}$. Otherwise we need
$N\vec{M}[v := N] \in \mathrm{SN}$. But this follows by multiple applications of IH(b), since every
$M_i[v := N]$ derives a subformula of $A$ with smaller depth. (a'). Similar, and
simpler. (b), (b'). Use (Var) again.

Case $\lambda v M$ by ($\lambda$) from $M \in \mathrm{SN}$. (a), (a'). Use ($\lambda$) again. (b). Our goal
is $(\lambda v M)N \in \mathrm{SN}$. By ($\beta$) it suffices to show $M[v := N] \in \mathrm{SN}$ and $N \in \mathrm{SN}$.
The latter holds by assumption, and the former by SIH(a). (b'). Similar,
and simpler.

Case $(\lambda w M)K\vec{L}$ by ($\beta$) from $M[w := K]\vec{L} \in \mathrm{SN}$ and $K \in \mathrm{SN}$. (a). The
SIH(a) yields $M[v := N][w := K[v := N]]\vec{L}[v := N] \in \mathrm{SN}$ and $K[v := N] \in \mathrm{SN}$,
hence $(\lambda w M[v := N])K[v := N]\vec{L}[v := N] \in \mathrm{SN}$ by ($\beta$). (a'). Similar,
and simpler. (b), (b'). Use ($\beta$) again.


COROLLARY. For every term we have $M \in \mathrm{SN}$; in particular every term
$M$ is strongly normalizable.


PROOF. Induction on the (first) inductive definition of derivation terms
$M$. In cases $u$ and $\lambda v M$ the claim follows from the definition of SN, and in
case $MN$ it follows from the preceding theorem.


3.5. Confluence. A relation $R$ is said to be confluent, or to have the
Church-Rosser property (CR), if, whenever $M_0 \, R \, M_1$ and $M_0 \, R \, M_2$, then
there is an $M_3$ such that $M_1 \, R \, M_3$ and $M_2 \, R \, M_3$. A relation $R$ is said to be
weakly confluent, or to have the weak Church-Rosser property (WCR), if,
whenever $M_0 \, R \, M_1$ and $M_0 \, R \, M_2$, then there is an $M_3$ such that $M_1 \, R^* \, M_3$ and
$M_2 \, R^* \, M_3$, where $R^*$ is the reflexive and transitive closure of $R$.

Clearly for a confluent reduction relation $\to^*$ the normal forms of terms
are unique.
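The difference between the two notions can be checked mechanically on finite relations. The sketch below is only an illustration: the relation is assumed to be given as a dictionary of successor lists, and the function names are hypothetical. The first example is the standard four-element relation that is weakly confluent but whose closure $R^*$ is not confluent; it contains a loop, so it is not strongly normalizing. The second example is terminating, and there weak confluence does give confluence of $R^*$.

```python
def star(R, x):
    """All y with x R* y (reflexive and transitive closure of R)."""
    seen, todo = {x}, [x]
    while todo:
        a = todo.pop()
        for b in R.get(a, ()):
            if b not in seen:
                seen.add(b)
                todo.append(b)
    return seen


def joinable(R, a, b):
    """True if a R* c and b R* c for some c."""
    return bool(star(R, a) & star(R, b))


def weakly_confluent(R, nodes):
    """WCR: any two one-step reducts of the same element are joinable."""
    return all(joinable(R, a, b)
               for m in nodes
               for a in R.get(m, ())
               for b in R.get(m, ()))


def star_confluent(R, nodes):
    """Confluence of R*: any two R*-reducts of the same element are joinable."""
    return all(joinable(R, a, b)
               for m in nodes
               for a in star(R, m)
               for b in star(R, m))


# a <- b <-> c -> d : weakly confluent, but b reduces to both normal forms a and d.
loop = {"b": ["a", "c"], "c": ["b", "d"]}
loop_nodes = ["a", "b", "c", "d"]

# m -> k -> n and m -> l -> n : terminating and weakly confluent.
diamond = {"m": ["k", "l"], "k": ["n"], "l": ["n"]}
diamond_nodes = ["m", "k", "l", "n"]
```

Here `weakly_confluent(loop, loop_nodes)` holds while `star_confluent(loop, loop_nodes)` fails, which shows that some termination assumption is needed to pass from weak confluence to confluence.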


LEMMA (Newman 1942). Let $\to^*$ be the transitive and reflexive closure
of $\to$, and let $\to$ be weakly confluent. Then the normal form w.r.t. $\to$ of
a strongly normalizing $M$ is unique. Moreover, if all terms are strongly
normalizing w.r.t. $\to$, then the relation $\to^*$ is confluent.


PROOF. Call $M$ good if it satisfies the confluence property w.r.t. $\to^*$,
i.e. if whenever $K \leftarrow^* M \to^* L$, then $K \to^* N \leftarrow^* L$ for some $N$. We
show that every strongly normalizing $M$ is good, by transfinite induction on
the well-founded partial order $\to^+$, restricted to all terms occurring in the
reduction tree of $M$. So let $M$ be given and assume

$$\forall M'.\ M \to^+ M' \Rightarrow M' \text{ is good}.$$

We must show that $M$ is good, so assume $K \leftarrow^* M \to^* L$. We may further
assume that there are $M'$, $M''$ such that $K \leftarrow^* M' \leftarrow M \to M'' \to^* L$, for
otherwise the claim is trivial. But then the claim follows from the assumed
weak confluence and the induction hypothesis for $M'$ and $M''$, as shown in
Table 2.


TABLE 2. Proof of Newman's lemma (diagram not recoverable from the source)


3.6. Uniqueness of Normal Forms. We first show that $\to$ is weakly
confluent. From this and the fact that it is strongly normalizing we can
easily infer (using Newman's Lemma) that the normal forms are unique.


PROPOSITION. $\to$ is weakly confluent.


PROOF. Assume $N_0 \leftarrow M \to N_1$. We show that $N_0 \to^* N \leftarrow^* N_1$ for
some $N$, by induction on $M$. If there are two inner reductions both on the
same subterm, then the claim follows from the induction hypothesis using
substitutivity. If they are on distinct subterms, then the subterms do not
overlap and the claim is obvious. It remains to deal with the case of a head
reduction together with an inner conversion.

[Diagrams not recoverable from the source; they complete the head reduction
of $(\lambda v M)N\vec{L}$ against an inner conversion.]

where for the lower left arrows we have used substitutivity again.


COROLLARY. Every term is strongly normalizing, hence normal forms 
are unique. 


3.7. The Structure of Normal Derivations. Let $M$ be a normal
derivation, viewed as a prooftree. A sequence of f.o.'s (formula occurrences)
$A_0, \dots, A_n$ such that (1) $A_0$ is a top formula (leaf) of the prooftree, and for
$0 \le i < n$, (2) $A_{i+1}$ is immediately below $A_i$, and (3) $A_i$ is not the minor
premise of an $\to^-$-application, is called a track of the deduction tree $M$. A
track of order 0 ends in the conclusion of $M$; a track of order $n + 1$ ends in
the minor premise of an $\to^-$-application with major premise belonging to a
track of order $n$.

Since by normality an E-rule cannot have the conclusion of an I-rule as
its major premise, the E-rules have to precede the I-rules in a track, so the
following is obvious: a track may be divided into an E-part, say $A_0, \dots, A_{i-1}$,
a minimal formula $A_i$, and an I-part $A_{i+1}, \dots, A_n$. In the E-part all rules
are E-rules; in the I-part all rules are I-rules; $A_i$ is the conclusion of an
E-rule and, if $i < n$, a premise of an I-rule. It is also easy to see that
each f.o. of $M$ belongs to some track. Tracks are pieces of branches of the
tree with successive f.o.'s in the subformula relationship: either $A_{i+1}$ is a
subformula of $A_i$ or vice versa. As a result, all formulas in a track $A_0, \dots, A_n$
are subformulas of $A_0$ or of $A_n$; and from this, by induction on the order
of tracks, we see that every formula in $M$ is a subformula either of an open
assumption or of the conclusion. To summarize, we have seen:


LEMMA. In a normal derivation each formula occurrence belongs to some
track.


PROOF. By induction on the height of normal derivations. 


THEOREM. In a normal derivation each formula is a subformula of either 
the end formula or else an assumption formula. 


PROOF. We prove this for tracks of order n, by induction on n. 


3.8. Normal Versus Non-Normal Derivations. We now show that
the requirement to give a normal derivation of a derivable formula can
sometimes be unrealistic. Following Statman [25] and Orevkov [19] we give
examples of formulas $C_k$ which are easily derivable with non-normal derivations
(whose number of nodes is linear in $k$), but which require a non-elementary
(in $k$) number of nodes in any normal derivation.

The example is related to Gentzen's proof in [9] of transfinite induction
up to $\omega_k$ in arithmetic. There the function $y \oplus \omega^x$ plays a crucial role, and
also the assignment of a "lifting"-formula $A^+(x)$ to any formula $A(x)$, by

$$A^+(x) := \forall y.\,(\forall z \prec y)\, A(z) \to (\forall z \prec y \oplus \omega^x)\, A(z).$$

Here we consider the numerical function $y + 2^x$ instead, and axiomatize its
graph by means of Horn clauses. The formula $C_k$ expresses that from these
axioms the existence of $2_k$ follows. A short, non-normal proof of this fact
can then be given by a modification of Gentzen's idea, and it is easily seen
that any normal proof of $C_k$ must contain at least $2_k$ nodes.

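The derivations to be given make heavy use of the existential quantifier
$\exists^{cl}$ defined by $\neg\forall\neg$. In particular we need:


LEMMA (Existence Introduction). $\vdash A \to \exists^{cl} x A$.


PROOF. $\lambda u^{A} \lambda v^{\forall x \neg A}.\, v x u$.


LEMMA (Existence Elimination). $\vdash (\neg\neg B \to B) \to \exists^{cl} x A \to (\forall x.\, A \to B) \to B$
if $x \notin \mathrm{FV}(B)$.


PROOF. $\lambda u^{\neg\neg B \to B} \lambda v^{\neg\forall x \neg A} \lambda w^{\forall x. A \to B}.\, u\, \lambda u_2^{\neg B}.\, v\, \lambda x \lambda u_1^{A}.\, u_2 (w x u_1)$.


Note that the stability assumption $\neg\neg B \to B$ is not needed if $B$ does
not contain an atom $\neq \bot$ as a strictly positive subformula. This will be the
case for the derivations below, where $B$ will always be a classical existential
formula.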

Let us now fix our language. We use a ternary relation symbol $R$ to
represent the graph of the function $y + 2^x$; so $R(y, x, z)$ is intended to mean
$y + 2^x = z$. We now axiomatize $R$ by means of Horn clauses. For simplicity
we use a unary function symbol $s$ (to be viewed as the successor function)
and a constant 0; one could use logic without function symbols instead (as
Orevkov does), but this makes the formulas somewhat less readable and
the proofs less perspicuous. Note that Orevkov's result is an adaptation of a
result of Statman [25] for languages containing function symbols.
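$$\mathrm{Hyp}_1\colon\ \forall y\, R(y, 0, s(y))$$
$$\mathrm{Hyp}_2\colon\ \forall y, x, z, z_1.\ R(y, x, z) \to R(z, x, z_1) \to R(y, s(x), z_1)$$

The goal formula then is

$$C_k := \exists^{cl} z_k, \dots, z_0.\ R(0, 0, z_k) \land R(0, z_k, z_{k-1}) \land \dots \land R(0, z_1, z_0).$$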
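The intended meaning of these formulas can be checked numerically. The sketch below is an illustration under the stated reading $R(y, x, z) \Leftrightarrow y + 2^x = z$, with $2_0 = 1$ and $2_{k+1} = 2^{2_k}$; the function names are hypothetical and not part of the text. It tests instances of $\mathrm{Hyp}_1$ and $\mathrm{Hyp}_2$ on small numbers and computes the witnesses $z_k, \dots, z_0$ required by $C_k$.

```python
def two_tower(k):
    """Iterated exponential: 2_0 = 1, 2_(k+1) = 2 ** 2_k."""
    n = 1
    for _ in range(k):
        n = 2 ** n
    return n


def R(y, x, z):
    """Intended interpretation of the relation symbol R: y + 2^x = z."""
    return y + 2 ** x == z


def hyp1_holds(limit):
    """Hyp_1 on 0..limit-1: R(y, 0, s(y))."""
    return all(R(y, 0, y + 1) for y in range(limit))


def hyp2_holds(limit):
    """Hyp_2 on 0..limit-1: from R(y, x, z) and R(z, x, z1) conclude R(y, s(x), z1).
    For given y and x the premises determine z and z1 uniquely."""
    return all(R(y, x + 1, z1)
               for y in range(limit)
               for x in range(limit)
               for z in [y + 2 ** x]
               for z1 in [z + 2 ** x])


def witnesses(k):
    """The list [z_k, z_(k-1), ..., z_0] with R(0, 0, z_k) and R(0, z_(i+1), z_i)."""
    zs = [2 ** 0]                 # z_k, since 0 + 2^0 = 1
    for _ in range(k):
        zs.append(2 ** zs[-1])    # z_i = 0 + 2^(z_(i+1))
    return zs
```

The last witness `witnesses(k)[-1]` equals `two_tower(k)`, which is the sense in which $C_k$ expresses the existence of $2_k$.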
To obtain the short proof of the goal formula $C_k$ we use formulas $A_i(x)$ with
a free parameter $x$.

$$A_0(x) := \forall y\, \exists^{cl} z\, R(y, x, z),$$
$$A_{i+1}(x) := \forall y.\ A_i(y) \to \exists^{cl} z.\ A_i(z) \land R(y, x, z).$$

For the two lemmata to follow we give an informal argument, which can
easily be converted into a formal proof. Note that the existence elimination
lemma is used only with existential formulas as conclusions. Hence it is not
necessary to use stability axioms and we have a derivation in minimal logic.


LEMMA. $\vdash \mathrm{Hyp}_1 \to \mathrm{Hyp}_2 \to A_i(0)$.


PROOF. Case $i = 0$. Obvious by $\mathrm{Hyp}_1$.

Case $i = 1$. Let $x$ with $A_0(x)$ be given. It is sufficient to show $A_0(s(x))$,
that is $\forall y\, \exists^{cl} z_1\, R(y, s(x), z_1)$. So let $y$ be given. We know

(4) $A_0(x) = \forall y\, \exists^{cl} z\, R(y, x, z)$.

Applying (4) to our $y$ gives $z$ such that $R(y, x, z)$. Applying (4) again to
this $z$ gives $z_1$ such that $R(z, x, z_1)$. By $\mathrm{Hyp}_2$ we obtain $R(y, s(x), z_1)$.

Case $i + 2$. Let $x$ with $A_{i+1}(x)$ be given. It suffices to show $A_{i+1}(s(x))$,
that is $\forall y.\ A_i(y) \to \exists^{cl} z.\ A_i(z) \land R(y, s(x), z)$. So let $y$ with $A_i(y)$ be given.
We know

(5) $A_{i+1}(x) = \forall y.\ A_i(y) \to \exists^{cl} z_1.\ A_i(z_1) \land R(y, x, z_1)$.

Applying (5) to our $y$ gives $z$ such that $A_i(z)$ and $R(y, x, z)$. Applying (5)
again to this $z$ gives $z_1$ such that $A_i(z_1)$ and $R(z, x, z_1)$. By $\mathrm{Hyp}_2$ we obtain
$R(y, s(x), z_1)$.

Note that the derivations given have a fixed length, independent of $i$.
LEMMA. $\vdash \mathrm{Hyp}_1 \to \mathrm{Hyp}_2 \to C_k$.


PROOF. $A_k(0)$ applied to 0 and $A_{k-1}(0)$ yields $z_k$ with $A_{k-1}(z_k)$ such
that $R(0, 0, z_k)$.

$A_{k-1}(z_k)$ applied to 0 and $A_{k-2}(0)$ yields $z_{k-1}$ with $A_{k-2}(z_{k-1})$ such
that $R(0, z_k, z_{k-1})$.

$A_1(z_2)$ applied to 0 and $A_0(0)$ yields $z_1$ with $A_0(z_1)$ such that $R(0, z_2, z_1)$.

$A_0(z_1)$ applied to 0 yields $z_0$ with $R(0, z_1, z_0)$.


Note that the derivations given have length linear in $k$.
We want to compare the length of this derivation of $C_k$ with the length
of an arbitrary normal derivation.

PROPOSITION. Any normal derivation of $C_k$ from $\mathrm{Hyp}_1$ and $\mathrm{Hyp}_2$ has at
least $2_k$ nodes.


PROOF. Let a normal derivation $M$ of falsity $\bot$ from $\mathrm{Hyp}_1$, $\mathrm{Hyp}_2$ and
the additional hypothesis

$$u\colon\ \forall z_k, \dots, z_0.\ R(0, 0, z_k) \to R(0, z_k, z_{k-1}) \to \dots \to R(0, z_1, z_0) \to \bot$$

be given. We may assume that $M$ does not contain free object variables
(otherwise substitute them by 0). The main branch of $M$ must begin with
$u$, and its side premises are all of the form $R(0, s^n(0), s^k(0))$.

Observe that any normal derivation of $R(s^m(0), s^n(0), s^k(0))$ from $\mathrm{Hyp}_1$,
$\mathrm{Hyp}_2$ and $u$ has at least $2^n$ occurrences of $\mathrm{Hyp}_1$ and is such that $k = m + 2^n$.
This can be seen easily by induction on $n$. Note also that such a derivation
cannot involve $u$.

If we apply this observation to the above derivations of the side premises
we see that they derive

$$R(0, 0, s^{2_0}(0)),\quad R(0, s^{2_0}(0), s^{2_1}(0)),\quad \dots,\quad R(0, s^{2_{k-1}}(0), s^{2_k}(0)).$$

The last of these derivations uses at least $2^{2_{k-1}} = 2_k$ times $\mathrm{Hyp}_1$.
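The count in the observation can be made concrete for the obvious derivation that unfolds $\mathrm{Hyp}_2$ all the way down to instances of $\mathrm{Hyp}_1$. The sketch below only counts the $\mathrm{Hyp}_1$ leaves of that particular derivation; it does not prove the lower bound for arbitrary normal derivations. The function names are illustrative.

```python
def two_tower(k):
    """Iterated exponential: 2_0 = 1, 2_(k+1) = 2 ** 2_k."""
    n = 1
    for _ in range(k):
        n = 2 ** n
    return n


def hyp1_leaves(n):
    """Number of Hyp_1 instances used to derive R(y, s^n(0), z) by unfolding Hyp_2.

    For n = 0 a single instance of Hyp_1 suffices. For n > 0, Hyp_2 needs two
    premises R(y, s^(n-1)(0), z) and R(z, s^(n-1)(0), z1), each derived recursively.
    """
    leaves = 1
    for _ in range(n):
        leaves *= 2
    return leaves


def last_side_premise_cost(k):
    """Hyp_1 leaves for the last side premise R(0, s^(2_(k-1))(0), s^(2_k)(0)), k >= 1."""
    return hyp1_leaves(two_tower(k - 1))
```

For $k = 3$ the last side premise has second argument $2_2 = 4$, so the unfolded derivation uses $2^4 = 16 = 2_3$ instances of $\mathrm{Hyp}_1$.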


4. Normalization including Permutative Conversions 


The elimination of "detours" done in Section 3 will now be extended to
the full language. However, incorporation of $\lor$, $\land$ and $\exists$ leads to difficulties.
If we do this by means of axioms (or constant derivation terms, as in 2.3),
we cannot read off as much as we want from a normal derivation. If we
do it in the form of rules, we must also allow permutative conversion. The
reason for the difficulty is that in the elimination rules for $\lor$, $\land$, $\exists$ the minor
premise reappears in the conclusion. This gives rise to a situation where we
first introduce a logical connective, then do not touch it (by carrying it along
in minor premises of $\lor^-$, $\land^-$, $\exists^-$), and finally eliminate the connective. This
is not a detour as we have treated them in Section 3, and the conversion
introduced there cannot deal with this situation. What has to be done is a
permutative conversion: permute an elimination immediately following an
$\lor^-$, $\land^-$, $\exists^-$-rule over this rule to the minor premise.

We will show that any sequence of such conversion steps terminates in
a normal form, which in fact is uniquely determined (again by Newman's
lemma).

Derivations in normal form have many pleasant properties, for instance:


Subformula property: every formula occurring in a normal derivation
is a subformula of either the conclusion or else an assumption;

Explicit definability: a normal derivation of a formula $\exists x A$ from
assumptions not involving disjunctive or existential strictly positive
parts ends with an existence introduction, hence also provides a
term $r$ and a derivation of $A[x := r]$;

Disjunction property: a normal derivation of a disjunction $A \lor B$
from assumptions not involving disjunctions as strictly positive
parts ends with a disjunction introduction, hence also provides
either a derivation of $A$ or else one of $B$;

4.1. Rules for $\lor$, $\land$ and $\exists$. Notice that we have not given rules for
the connectives $\lor$, $\land$ and $\exists$. There are two reasons for this omission:


• They can be covered by means of appropriate axioms as constant
derivation terms, as given in 2.3;

• For simplicity we want our derivation terms to be pure lambda
terms formed just by lambda abstraction and application. This
would be violated by the rules for $\lor$, $\land$ and $\exists$, which require
additional constructs.


However, as just noted, in order to have a normalization theorem with a
useful subformula property as a consequence we do need to consider rules
for these connectives. So here they are:

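Disjunction. The introduction rules are

$$
\frac{\begin{array}{c} \mid M \\ A \end{array}}{A \lor B}\ \lor^+_0
\qquad
\frac{\begin{array}{c} \mid M \\ B \end{array}}{A \lor B}\ \lor^+_1
$$

and the elimination rule is

$$
\frac{\begin{array}{c} \mid M \\ A \lor B \end{array} \quad \begin{array}{c} [u\colon A] \\ \mid N \\ C \end{array} \quad \begin{array}{c} [v\colon B] \\ \mid K \\ C \end{array}}{C}\ \lor^- u, v
$$

Conjunction. The introduction rule is

$$
\frac{\begin{array}{c} \mid M \\ A \end{array} \quad \begin{array}{c} \mid N \\ B \end{array}}{A \land B}\ \land^+
$$

and the elimination rule is

$$
\frac{\begin{array}{c} \mid M \\ A \land B \end{array} \quad \begin{array}{c} [u\colon A]\ [v\colon B] \\ \mid N \\ C \end{array}}{C}\ \land^- u, v
$$

Existential Quantifier. The introduction rule is

$$
\frac{r \quad \begin{array}{c} \mid M \\ A[x := r] \end{array}}{\exists x A}\ \exists^+
$$

and the elimination rule is

$$
\frac{\begin{array}{c} \mid M \\ \exists x A \end{array} \quad \begin{array}{c} [u\colon A] \\ \mid N \\ B \end{array}}{B}\ \exists^- x, u \ \text{(var.cond.)}
$$

The rule $\exists^- x, u$ is subject to the following (Eigen-) variable condition: The
derivation $N$ should not contain any open assumptions apart from $u\colon A$
whose assumption formula contains $x$ free, and moreover $B$ should not
contain the variable $x$ free.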

It is easy to see that for each of the connectives $\lor$, $\land$, $\exists$ the rules and the
axioms are equivalent, in the sense that from the axioms and the premises
of a rule we can derive its conclusion (of course without any $\lor$, $\land$, $\exists$-rules),
and conversely that we can derive the axioms by means of the $\lor$, $\land$, $\exists$-rules.
This is left as an exercise.

The left premise in each of the elimination rules $\lor^-$, $\land^-$ and $\exists^-$ is called
major premise (or main premise), and each of the right premises minor
premise (or side premise).


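4.2. Conversion. In addition to the $\to$, $\forall$-conversions treated in 3.1,
we consider the following conversions:

$\lor$-conversion.

$$
\frac{\dfrac{\begin{array}{c} \mid M \\ A \end{array}}{A \lor B}\ \lor^+_0 \quad \begin{array}{c} [u\colon A] \\ \mid N \\ C \end{array} \quad \begin{array}{c} [v\colon B] \\ \mid K \\ C \end{array}}{C}\ \lor^- u, v
\quad\mapsto\quad
\begin{array}{c} \mid M \\ A \\ \mid N \\ C \end{array}
$$

and

$$
\frac{\dfrac{\begin{array}{c} \mid M \\ B \end{array}}{A \lor B}\ \lor^+_1 \quad \begin{array}{c} [u\colon A] \\ \mid N \\ C \end{array} \quad \begin{array}{c} [v\colon B] \\ \mid K \\ C \end{array}}{C}\ \lor^- u, v
\quad\mapsto\quad
\begin{array}{c} \mid M \\ B \\ \mid K \\ C \end{array}
$$

$\land$-conversion.

$$
\frac{\dfrac{\begin{array}{c} \mid M \\ A \end{array} \quad \begin{array}{c} \mid N \\ B \end{array}}{A \land B}\ \land^+ \quad \begin{array}{c} [u\colon A]\ [v\colon B] \\ \mid K \\ C \end{array}}{C}\ \land^- u, v
\quad\mapsto\quad
\begin{array}{c} \mid M \quad \mid N \\ A \qquad B \\ \mid K \\ C \end{array}
$$

$\exists$-conversion.

$$
\frac{\dfrac{r \quad \begin{array}{c} \mid M \\ A[x := r] \end{array}}{\exists x A}\ \exists^+ \quad \begin{array}{c} [u\colon A] \\ \mid N \\ B \end{array}}{B}\ \exists^- x, u
\quad\mapsto\quad
\begin{array}{c} \mid M \\ A[x := r] \\ \mid N' \\ B \end{array}
$$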


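4.3. Permutative Conversion. In a permutative conversion we permute
an E-rule upwards over the minor premises of $\lor^-$, $\land^-$ or $\exists^-$.

$\lor$-perm conversion.

$$
\frac{\dfrac{\begin{array}{c} \mid M \\ A \lor B \end{array} \quad \begin{array}{c} \mid N \\ C \end{array} \quad \begin{array}{c} \mid K \\ C \end{array}}{C} \quad \begin{array}{c} \mid L \\ C' \end{array}}{D}\ \text{E-rule}
\quad\mapsto\quad
\frac{\begin{array}{c} \mid M \\ A \lor B \end{array} \quad \dfrac{\begin{array}{c} \mid N \\ C \end{array} \quad \begin{array}{c} \mid L \\ C' \end{array}}{D}\ \text{E-rule} \quad \dfrac{\begin{array}{c} \mid K \\ C \end{array} \quad \begin{array}{c} \mid L \\ C' \end{array}}{D}\ \text{E-rule}}{D}
$$

$\land$-perm conversion.

$$
\frac{\dfrac{\begin{array}{c} \mid M \\ A \land B \end{array} \quad \begin{array}{c} \mid N \\ C \end{array}}{C} \quad \begin{array}{c} \mid K \\ C' \end{array}}{D}\ \text{E-rule}
\quad\mapsto\quad
\frac{\begin{array}{c} \mid M \\ A \land B \end{array} \quad \dfrac{\begin{array}{c} \mid N \\ C \end{array} \quad \begin{array}{c} \mid K \\ C' \end{array}}{D}\ \text{E-rule}}{D}
$$

$\exists$-perm conversion.

$$
\frac{\dfrac{\begin{array}{c} \mid M \\ \exists x A \end{array} \quad \begin{array}{c} \mid N \\ B \end{array}}{B} \quad \begin{array}{c} \mid K \\ C \end{array}}{D}\ \text{E-rule}
\quad\mapsto\quad
\frac{\begin{array}{c} \mid M \\ \exists x A \end{array} \quad \dfrac{\begin{array}{c} \mid N \\ B \end{array} \quad \begin{array}{c} \mid K \\ C \end{array}}{D}\ \text{E-rule}}{D}
$$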


4.4. Derivations as Terms. The term representation of derivations
has to be extended. The rules for $\lor$, $\land$ and $\exists$ with the corresponding terms
are given in Table 3.

The introduction rule $\exists^+$ has as its left premise the witnessing term $r$ to
be substituted. The elimination rule $\exists^- u$ is subject to an (Eigen-) variable
condition: The derivation term $N$ should not contain any open assumptions
apart from $u\colon A$ whose assumption formula contains $x$ free, and moreover
$B$ should not contain the variable $x$ free.


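4.5. Permutative Conversions. In this section we shall write derivation
terms without formula superscripts. We usually leave implicit the extra
(formula) parts of derivation constants and for instance write $\exists^+$, $\exists^-$ instead
of $\exists^+_{x,A}$, $\exists^-_{x,A,B}$. So we consider derivation terms $M, N, K$ of the forms

$$u \mid \lambda v M \mid \lambda y M \mid \lor^+_0 M \mid \lor^+_1 M \mid \langle M, N \rangle \mid \exists^+ r M \mid$$
$$MN \mid Mr \mid M(v_0.N_0, v_1.N_1) \mid M(v, w.N) \mid M(v.N);$$

in these expressions the variables $y, v, v_0, v_1, w$ get bound.

To simplify the technicalities, we restrict our treatment to the rules for
$\to$ and $\exists$. It can easily be extended to the full set of rules; some details for
disjunction are given in 4.6. So we consider

$$u \mid \lambda v M \mid \exists^+ r M \mid MN \mid M(v.N);$$

in these expressions the variable $v$ gets bound.

We reserve the letters $E, F, G$ for eliminations, i.e. expressions of the
form $(v.N)$, and $R, S, T$ for both terms and eliminations. Using this notation
we obtain a second (and clearly equivalent) inductive definition of terms:

$$u\vec{M} \mid u\vec{M}E \mid \lambda v M \mid \exists^+ r M \mid (\lambda v M)N\vec{R} \mid \exists^+ r M(v.N)\vec{R} \mid u\vec{M}ER\vec{S}.$$

TABLE 3. Derivation terms for $\lor$, $\land$ and $\exists$

    $\lor^+_0$: from $M\colon A$ infer $A \lor B$; term $(\lor^+_{0,B} M^A)^{A \lor B}$.
    $\lor^+_1$: from $M\colon B$ infer $A \lor B$; term $(\lor^+_{1,A} M^B)^{A \lor B}$.
    $\lor^- u, v$: from $M\colon A \lor B$, $N\colon C$ (discharging $u\colon A$) and $K\colon C$
        (discharging $v\colon B$) infer $C$; term $(M^{A \lor B}(u^A.N^C, v^B.K^C))^C$.
    $\land^+$: from $M\colon A$ and $N\colon B$ infer $A \land B$; term $\langle M^A, N^B \rangle^{A \land B}$.
    $\land^- u, v$: from $M\colon A \land B$ and $N\colon C$ (discharging $u\colon A$, $v\colon B$)
        infer $C$; term $(M^{A \land B}(u^A, v^B.N^C))^C$.
    $\exists^+$: from $r$ and $M\colon A[x := r]$ infer $\exists x A$; term $(\exists^+_{x,A}\, r\, M^{A[x := r]})^{\exists x A}$.
    $\exists^- x, u$ (var.cond.): from $M\colon \exists x A$ and $N\colon B$ (discharging $u\colon A$)
        infer $B$; term $(M^{\exists x A}(u^A.N^B))^B$ (var.cond.).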


Here the final three forms are not normal: $(\lambda v M)N\vec{R}$ and $\exists^+ r M(v.N)\vec{R}$
both are $\beta$-redexes, and $u\vec{M}ER\vec{S}$ is a permutative redex. The conversion
rules are

$$(\lambda v M)N \mapsto_\beta M[v := N] \qquad \beta_\to\text{-conversion},$$
$$\exists^+_{x,A}\, r M(v.N) \mapsto_\beta N[x := r][v := M] \qquad \beta_\exists\text{-conversion},$$
$$M(v.N)R \mapsto_\pi M(v.NR) \qquad \text{permutative conversion}.$$

The closure of these conversions is defined by

• If $M \mapsto_\beta M'$ or $M \mapsto_\pi M'$, then $M \to M'$.
• If $M \to M'$, then also $MR \to M'R$, $NM \to NM'$, $N(v.M) \to N(v.M')$,
  $\lambda v M \to \lambda v M'$, $\exists^+ r M \to \exists^+ r M'$ (inner reductions).
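The permutative conversion $M(v.N)R \mapsto_\pi M(v.NR)$ can be tried out on a small term representation. The sketch below is only an illustration: it covers the fragment $u \mid \lambda v M \mid MN \mid M(v.N)$, encodes terms as tagged tuples of my own choosing, performs no $\beta$-conversions, and assumes that bound variables are named apart so that the side condition "$v$ not free in $R$" holds automatically.

```python
# Term encoding (hypothetical, for illustration only):
#   ("var", u)          assumption variable u
#   ("lam", v, M)       lambda v M
#   ("app", M, N)       M N
#   ("elim", M, v, N)   M(v.N)


def perm_step(t):
    """Contract a permutative redex M(v.N)R at the root; return None if there is none."""
    if t[0] == "app" and t[1][0] == "elim":
        _, (_, m, v, n), k = t                    # R is the term k
        return ("elim", m, v, ("app", n, k))
    if t[0] == "elim" and t[1][0] == "elim":
        _, (_, m, v, n), w, k = t                 # R is the elimination (w.k)
        return ("elim", m, v, ("elim", n, w, k))
    return None


def perm_normalize(t):
    """Apply permutative conversions (and only those) until none is possible."""
    tag = t[0]
    if tag == "var":
        return t
    if tag == "lam":
        return ("lam", t[1], perm_normalize(t[2]))
    if tag == "app":
        t = ("app", perm_normalize(t[1]), perm_normalize(t[2]))
    else:  # "elim"
        t = ("elim", perm_normalize(t[1]), t[2], perm_normalize(t[3]))
    s = perm_step(t)
    return perm_normalize(s) if s is not None else t


u, n, k, l = ("var", "u"), ("var", "n"), ("var", "k"), ("var", "l")
# u(v.n)(w.k)l, a permutative redex of the form u E R S
redex = ("app", ("elim", ("elim", u, "v", n), "w", k), l)
```

Two permutative steps push both `(w.k)` and `l` into the minor premise: `perm_normalize(redex)` is the encoding of $u(v.n(w.kl))$.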


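We now give the rules to inductively generate a set SN:

$$
\frac{\vec{M} \in \mathrm{SN}}{u\vec{M} \in \mathrm{SN}}\ (\mathrm{Var}_0)
\qquad
\frac{M \in \mathrm{SN}}{\lambda v M \in \mathrm{SN}}\ (\lambda)
\qquad
\frac{M \in \mathrm{SN}}{\exists^+ r M \in \mathrm{SN}}\ (\exists)
$$

$$
\frac{\vec{M}, N \in \mathrm{SN}}{u\vec{M}(v.N) \in \mathrm{SN}}\ (\mathrm{Var})
\qquad
\frac{u\vec{M}(v.NR)\vec{S} \in \mathrm{SN}}{u\vec{M}(v.N)R\vec{S} \in \mathrm{SN}}\ (\mathrm{Var}_\pi)
$$

$$
\frac{M[v := N]\vec{R} \in \mathrm{SN} \quad N \in \mathrm{SN}}{(\lambda v M)N\vec{R} \in \mathrm{SN}}\ (\beta_\to)
\qquad
\frac{N[x := r][v := M]\vec{R} \in \mathrm{SN} \quad M \in \mathrm{SN}}{\exists^+_{x,A}\, r M(v.N)\vec{R} \in \mathrm{SN}}\ (\beta_\exists)
$$

where in ($\mathrm{Var}_\pi$) we require that $v$ is not free in $R$.

Write $M{\downarrow}$ to mean that $M$ is strongly normalizable, i.e., that every
reduction sequence starting from $M$ terminates. By analyzing the possible
reduction steps we now show that the set $\mathrm{Wf} := \{ M \mid M{\downarrow} \}$ has the closure
properties of the definition of SN above, and hence $\mathrm{SN} \subseteq \mathrm{Wf}$.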


LEMMA. Every term in SN is strongly normalizable.


PROOF. We distinguish cases according to the generation rule of SN
applied last. The following rules deserve special attention.

Case ($\mathrm{Var}_\pi$). We prove, as an auxiliary lemma, that

$$u\vec{M}(v.NR)\vec{S}{\downarrow} \quad\text{implies}\quad u\vec{M}(v.N)R\vec{S}{\downarrow},$$

by induction on $u\vec{M}(v.NR)\vec{S}{\downarrow}$ (i.e., on the reduction tree of this term). We
consider the possible reducts of $u\vec{M}(v.N)R\vec{S}$. The only interesting case is
$R\vec{S} = (v'.N')T\vec{T}$ and we have a permutative conversion of $R = (v'.N')$ with
$T$, leading to the term $M = u\vec{M}(v.N)(v'.N'T)\vec{T}$. Now $M{\downarrow}$ follows, since

$$u\vec{M}(v.NR)\vec{S} = u\vec{M}(v.N(v'.N'))T\vec{T}$$

leads in two permutative steps to $u\vec{M}(v.N(v'.N'T))\vec{T}$, hence for this term
we have the induction hypothesis available.

Case ($\beta_\to$). We show that $M[v := N]\vec{R}{\downarrow}$ and $N{\downarrow}$ imply $(\lambda v M)N\vec{R}{\downarrow}$.
This is done by an induction on $N{\downarrow}$, with a side induction on $M[v := N]\vec{R}{\downarrow}$.
We need to consider all possible reducts of $(\lambda v M)N\vec{R}$. In case of an outer
$\beta$-reduction use the assumption. If $N$ is reduced, use the induction hypothesis.
Reductions in $M$ and in $\vec{R}$ as well as permutative reductions within $\vec{R}$ are
taken care of by the side induction hypothesis.

Case ($\beta_\exists$). We show that $N[x := r][v := M]\vec{R}{\downarrow}$ and $M{\downarrow}$ together imply
$\exists^+ r M(v.N)\vec{R}{\downarrow}$. This is done by a threefold induction: first on $M{\downarrow}$, second
on $N[x := r][v := M]\vec{R}{\downarrow}$ and third on the length of $\vec{R}$. We need to consider
all possible reducts of $\exists^+ r M(v.N)\vec{R}$. In case of an outer $\beta$-reduction use the
assumption. If $M$ is reduced, use the first induction hypothesis. Reductions
in $N$ and in $\vec{R}$ as well as permutative reductions within $\vec{R}$ are taken care
of by the second induction hypothesis. The only remaining case is when
$\vec{R} = S\vec{S}$ and $(v.N)$ is permuted with $S$, to yield $\exists^+ r M(v.NS)\vec{S}$. Apply the
third induction hypothesis, since
$(NS)[x := r][v := M]\vec{S} = N[x := r][v := M]S\vec{S}$.


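For later use we prove a slightly generalized form of the rule ($\mathrm{Var}_\pi$):

PROPOSITION. If $M(v.NR)\vec{S} \in \mathrm{SN}$, then $M(v.N)R\vec{S} \in \mathrm{SN}$.


PROOF. Induction on the generation of $M(v.NR)\vec{S} \in \mathrm{SN}$. We distinguish
cases according to the form of $M$.

Case $u\vec{T}(v.NR)\vec{S} \in \mathrm{SN}$. If $\vec{T} = \vec{M}$, use ($\mathrm{Var}_\pi$). Otherwise we have
$u\vec{M}(v'.N')\vec{R}(v.NR)\vec{S} \in \mathrm{SN}$. This must be generated by repeated applications
of ($\mathrm{Var}_\pi$) from $u\vec{M}(v'.N'\vec{R}(v.NR)\vec{S}) \in \mathrm{SN}$, and finally by (Var) from
$\vec{M} \in \mathrm{SN}$ and $N'\vec{R}(v.NR)\vec{S} \in \mathrm{SN}$. The induction hypothesis for the latter
yields $N'\vec{R}(v.N)R\vec{S} \in \mathrm{SN}$, hence $u\vec{M}(v'.N'\vec{R}(v.N)R\vec{S}) \in \mathrm{SN}$ by (Var) and
finally $u\vec{M}(v'.N')\vec{R}(v.N)R\vec{S} \in \mathrm{SN}$ by ($\mathrm{Var}_\pi$).

Case $\exists^+ r M\vec{T}(v.NR)\vec{S} \in \mathrm{SN}$. Similarly, with ($\beta_\exists$) instead of ($\mathrm{Var}_\pi$). In
detail: If $\vec{T}$ is empty, by ($\beta_\exists$) this came from
$(NR)[x := r][v := M]\vec{S} = N[x := r][v := M]R\vec{S} \in \mathrm{SN}$ and $M \in \mathrm{SN}$, hence
$\exists^+ r M(v.N)R\vec{S} \in \mathrm{SN}$ again by ($\beta_\exists$). Otherwise we have
$\exists^+ r M(v'.N')\vec{T}(v.NR)\vec{S} \in \mathrm{SN}$. This must be generated by ($\beta_\exists$) from
$N'[x := r][v' := M]\vec{T}(v.NR)\vec{S} \in \mathrm{SN}$. The induction hypothesis yields
$N'[x := r][v' := M]\vec{T}(v.N)R\vec{S} \in \mathrm{SN}$, hence
$\exists^+ r M(v'.N')\vec{T}(v.N)R\vec{S} \in \mathrm{SN}$ by ($\beta_\exists$).

Case $(\lambda v M)N'\vec{R}(w.NR)\vec{S} \in \mathrm{SN}$. By ($\beta_\to$) this came from $N' \in \mathrm{SN}$
and $M[v := N']\vec{R}(w.NR)\vec{S} \in \mathrm{SN}$. The induction hypothesis yields
$M[v := N']\vec{R}(w.N)R\vec{S} \in \mathrm{SN}$, hence $(\lambda v M)N'\vec{R}(w.N)R\vec{S} \in \mathrm{SN}$ by ($\beta_\to$).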


In what follows we shall show that every term is in SN and hence is
strongly normalizable. Given the definition of SN we only have to show
that SN is closed under $\to^-$ and $\exists^-$. In order to prove this we must prove
simultaneously the closure of SN under substitution.

THEOREM (Properties of SN). For all formulas $A$,

(a) for all $M \in \mathrm{SN}$, if $M$ proves $A = A_0 \to A_1$ and $N \in \mathrm{SN}$, then $MN \in \mathrm{SN}$,
(b) for all $M \in \mathrm{SN}$, if $M$ proves $A = \exists x B$ and $N \in \mathrm{SN}$, then $M(v.N) \in \mathrm{SN}$,
(c) for all $M \in \mathrm{SN}$, if $N^A \in \mathrm{SN}$, then $M[v := N] \in \mathrm{SN}$.


PROOF. Induction on $\mathrm{dp}(A)$. We prove (a) and (b) before (c), and hence
have (a) and (b) available for the proof of (c). More formally, by induction
on $A$ we simultaneously prove that (a) holds, that (b) holds and that (a),
(b) together imply (c).

(a). By induction on $M \in \mathrm{SN}$. Let $M \in \mathrm{SN}$ and assume that $M$ proves
$A = A_0 \to A_1$ and $N \in \mathrm{SN}$. We distinguish cases according to how $M \in \mathrm{SN}$
was generated. For ($\mathrm{Var}_0$), ($\mathrm{Var}_\pi$), ($\beta_\to$) and ($\beta_\exists$) use the same rule again.

Case $u\vec{M}(v.N') \in \mathrm{SN}$ by (Var) from $\vec{M}, N' \in \mathrm{SN}$. Then $N'N \in \mathrm{SN}$ by
side induction hypothesis for $N'$, hence $u\vec{M}(v.N'N) \in \mathrm{SN}$ by (Var), hence
$u\vec{M}(v.N')N \in \mathrm{SN}$ by ($\mathrm{Var}_\pi$).

Case $(\lambda v M)^{A_0 \to A_1} \in \mathrm{SN}$ by ($\lambda$) from $M \in \mathrm{SN}$. Use ($\beta_\to$); for this we
need to know $M[v := N] \in \mathrm{SN}$. But this follows from IH(c) for $M$, since $N$
derives $A_0$.

(b). By induction on $M \in \mathrm{SN}$. Let $M \in \mathrm{SN}$ and assume that $M$ proves
$A = \exists x B$ and $N \in \mathrm{SN}$. The goal is $M(v.N) \in \mathrm{SN}$. We distinguish cases
according to how $M \in \mathrm{SN}$ was generated. For ($\mathrm{Var}_\pi$), ($\beta_\to$) and ($\beta_\exists$) use
the same rule again.

Case $u\vec{M} \in \mathrm{SN}$ by ($\mathrm{Var}_0$) from $\vec{M} \in \mathrm{SN}$. Use (Var).

Case $(\exists^+ r M)^{\exists x A} \in \mathrm{SN}$ by ($\exists$) from $M \in \mathrm{SN}$. Use ($\beta_\exists$); for this we need
to know $N[x := r][v := M] \in \mathrm{SN}$. But this follows from IH(c) for $N[x := r]$,
since $M$ derives $A[x := r]$.

Case $u\vec{M}(v'.N') \in \mathrm{SN}$ by (Var) from $\vec{M}, N' \in \mathrm{SN}$. Then $N'(v.N) \in \mathrm{SN}$
by side induction hypothesis for $N'$, hence $u\vec{M}(v'.N'(v.N)) \in \mathrm{SN}$ by (Var)
and therefore $u\vec{M}(v'.N')(v.N) \in \mathrm{SN}$ by ($\mathrm{Var}_\pi$).

(c). By induction on $M \in \mathrm{SN}$. Let $N^A \in \mathrm{SN}$; the goal is $M[v := N] \in \mathrm{SN}$.
We distinguish cases according to how $M \in \mathrm{SN}$ was generated. For ($\lambda$),
($\exists$), ($\beta_\to$) and ($\beta_\exists$) use the same rule again.

Case $u\vec{M} \in \mathrm{SN}$ by ($\mathrm{Var}_0$) from $\vec{M} \in \mathrm{SN}$. Then $\vec{M}[v := N] \in \mathrm{SN}$ by
SIH(c). If $u \neq v$, use ($\mathrm{Var}_0$) again. If $u = v$, we must show
$N\vec{M}[v := N] \in \mathrm{SN}$. Note that $N$ proves $A$; hence the claim follows from (a) and the
induction hypothesis.

Case $u\vec{M}(v'.N') \in \mathrm{SN}$ by (Var) from $\vec{M}, N' \in \mathrm{SN}$. If $u \neq v$, use (Var)
again. If $u = v$, we must show $N\vec{M}[v := N](v'.N'[v := N]) \in \mathrm{SN}$. Note
that $N$ proves $A$; hence in case $\vec{M}$ empty the claim follows from (b), and
otherwise from (a) and the induction hypothesis.

Case $u\vec{M}(v'.N')R\vec{S} \in \mathrm{SN}$ by ($\mathrm{Var}_\pi$) from $u\vec{M}(v'.N'R)\vec{S} \in \mathrm{SN}$. If $u \neq v$,
use ($\mathrm{Var}_\pi$) again. If $u = v$, from the induction hypothesis we obtain

$$N\vec{M}[v := N](v'.N'[v := N]R[v := N])\vec{S}[v := N] \in \mathrm{SN}.$$

Now use the proposition above.


COROLLARY. Every term is strongly normalizable. 


PROOF. Induction on the (first) inductive definition of terms $M$. In
cases $u$ and $\lambda v M$ the claim follows from the definition of SN, and in cases
$MN$ and $M(v.N)$ it follows from parts (a), (b) of the previous theorem.


4.6. Disjunction. We describe the changes necessary to extend the
result above to the language with disjunction $\lor$.
We have additional $\beta$-conversions

$$\lor^+_i M(v_0.N_0, v_1.N_1) \mapsto_\beta N_i[v_i := M] \qquad \beta_{\lor_i}\text{-conversion}.$$

The definition of SN needs to be extended by

$$
\frac{M \in \mathrm{SN}}{\lor^+_i M \in \mathrm{SN}}\ (\lor_i)
$$

$$
\frac{\vec{M}, N_0, N_1 \in \mathrm{SN}}{u\vec{M}(v_0.N_0, v_1.N_1) \in \mathrm{SN}}\ (\mathrm{Var}_\lor)
\qquad
\frac{u\vec{M}(v_0.N_0 R, v_1.N_1 R)\vec{S} \in \mathrm{SN}}{u\vec{M}(v_0.N_0, v_1.N_1)R\vec{S} \in \mathrm{SN}}\ (\mathrm{Var}_{\lor,\pi})
$$

$$
\frac{N_i[v_i := M]\vec{R} \in \mathrm{SN} \quad N_{1-i}\vec{R} \in \mathrm{SN} \quad M \in \mathrm{SN}}{\lor^+_i M(v_0.N_0, v_1.N_1)\vec{R} \in \mathrm{SN}}\ (\beta_{\lor_i})
$$

The former rules (Var), ($\mathrm{Var}_\pi$) should then be renamed into ($\mathrm{Var}_\exists$), ($\mathrm{Var}_{\exists,\pi}$).

The lemma above stating that every term in SN is strongly normalizable
needs to be extended by an additional clause:

Case ($\beta_{\lor_i}$). We show that $N_i[v_i := M]\vec{R}{\downarrow}$, $N_{1-i}\vec{R}{\downarrow}$ and $M{\downarrow}$ together
imply $\lor^+_i M(v_0.N_0, v_1.N_1)\vec{R}{\downarrow}$. This is done by a fourfold induction: first on $M{\downarrow}$,
second on $N_i[v_i := M]\vec{R}{\downarrow}$, $N_{1-i}\vec{R}{\downarrow}$, third on $N_{1-i}\vec{R}{\downarrow}$ and fourth on the length
of $\vec{R}$. We need to consider all possible reducts of $\lor^+_i M(v_0.N_0, v_1.N_1)\vec{R}$. In
case of an outer $\beta$-reduction use the assumption. If $M$ is reduced, use the
first induction hypothesis. Reductions in $N_i$ and in $\vec{R}$ as well as permutative
reductions within $\vec{R}$ are taken care of by the second induction hypothesis.
Reductions in $N_{1-i}$ are taken care of by the third induction hypothesis. The
only remaining case is when $\vec{R} = S\vec{S}$ and $(v_0.N_0, v_1.N_1)$ is permuted with
$S$, to yield $(v_0.N_0 S, v_1.N_1 S)$. Apply the fourth induction hypothesis, since
$(N_i S)[v := M]\vec{S} = N_i[v := M]S\vec{S}$.

Finally the theorem above stating properties of SN needs an additional
clause:

• for all $M \in \mathrm{SN}$, if $M$ proves $A = A_0 \lor A_1$ and $N_0, N_1 \in \mathrm{SN}$, then
  $M(v_0.N_0, v_1.N_1) \in \mathrm{SN}$.


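PROOF. The new clause is proved by induction on $M \in \mathrm{SN}$. Let $M \in \mathrm{SN}$
and assume that $M$ proves $A = A_0 \lor A_1$ and $N_0, N_1 \in \mathrm{SN}$. The goal is
$M(v_0.N_0, v_1.N_1) \in \mathrm{SN}$. We distinguish cases according to how $M \in \mathrm{SN}$ was
generated. For ($\mathrm{Var}_{\exists,\pi}$), ($\mathrm{Var}_{\lor,\pi}$), ($\beta_\to$), ($\beta_\exists$) and ($\beta_{\lor_i}$) use the same rule
again.

Case $u\vec{M} \in \mathrm{SN}$ by ($\mathrm{Var}_0$) from $\vec{M} \in \mathrm{SN}$. Use ($\mathrm{Var}_\lor$).

Case $(\lor^+_i M)^{A_0 \lor A_1} \in \mathrm{SN}$ by ($\lor_i$) from $M \in \mathrm{SN}$. Use ($\beta_{\lor_i}$); for this we
need to know $N_i[v_i := M] \in \mathrm{SN}$ and $N_{1-i} \in \mathrm{SN}$. The latter is assumed,
and the former follows from main induction hypothesis (with $N_i$) for the
substitution clause of the theorem, since $M$ derives $A_i$.

Case $u\vec{M}(v'.N') \in \mathrm{SN}$ by ($\mathrm{Var}_\exists$) from $\vec{M}, N' \in \mathrm{SN}$. For brevity let
$E := (v_0.N_0, v_1.N_1)$. Then $N'E \in \mathrm{SN}$ by side induction hypothesis for $N'$,
so $u\vec{M}(v'.N'E) \in \mathrm{SN}$ by ($\mathrm{Var}_\exists$) and therefore $u\vec{M}(v'.N')E \in \mathrm{SN}$ by ($\mathrm{Var}_{\exists,\pi}$).

Case $u\vec{M}(v_0'.N_0', v_1'.N_1') \in \mathrm{SN}$ by ($\mathrm{Var}_\lor$) from $\vec{M}, N_0', N_1' \in \mathrm{SN}$. Let
$E := (v_0.N_0, v_1.N_1)$. Then $N_i'E \in \mathrm{SN}$ by side induction hypothesis for $N_i'$,
so $u\vec{M}(v_0'.N_0'E, v_1'.N_1'E) \in \mathrm{SN}$ by ($\mathrm{Var}_\lor$) and therefore
$u\vec{M}(v_0'.N_0', v_1'.N_1')E \in \mathrm{SN}$ by ($\mathrm{Var}_{\lor,\pi}$).

Clause (c) now needs additional cases, e.g.,

Case $u\vec{M}(v_0.N_0, v_1.N_1) \in \mathrm{SN}$ by ($\mathrm{Var}_\lor$) from $\vec{M}, N_0, N_1 \in \mathrm{SN}$. If $u \neq v$,
use ($\mathrm{Var}_\lor$). If $u = v$, we show $N\vec{M}[v := N](v_0.N_0[v := N], v_1.N_1[v := N]) \in \mathrm{SN}$.
Note that $N$ proves $A$; hence in case $\vec{M}$ empty the claim follows from
(b), and otherwise from (a) and the induction hypothesis.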


4.7. The Structure of Normal Derivations. As mentioned already,
normalizations aim at removing local maxima of complexity, i.e. formula
occurrences which are first introduced and immediately afterwards eliminated.
However, an introduced formula may be used as a minor premise of an
application of $\lor^-$, $\land^-$ or $\exists^-$, then stay the same throughout a sequence of
applications of these rules, being eliminated at the end. This also constitutes
a local maximum, which we should like to eliminate; for that we need
the so-called permutative conversions. First we give a precise definition.


DEFINITION. A segment (of length $n$) in a derivation $M$ is a sequence
$A_1, \dots, A_n$ of occurrences of a formula $A$ such that

(a) for $1 \le i < n$, $A_i$ is a minor premise of an application of $\lor^-$, $\land^-$ or $\exists^-$,
    with conclusion $A_{i+1}$;

(b) $A_n$ is not a minor premise of $\lor^-$, $\land^-$ or $\exists^-$;

(c) $A_1$ is not the conclusion of $\lor^-$, $\land^-$ or $\exists^-$.

(Note: An f.o. which is neither a minor premise nor the conclusion of an
application of $\lor^-$, $\land^-$ or $\exists^-$ always belongs to a segment of length 1.) A
segment is maximal or a cut (segment) if $A_n$ is the major premise of an
E-rule, and either $n > 1$, or $n = 1$ and $A_1 = A_n$ is the conclusion of an
I-rule.


We shall use $\sigma, \sigma'$ for segments. We shall say that $\sigma$ is a subformula of
$\sigma'$ if the formula $A$ in $\sigma$ is a subformula of $B$ in $\sigma'$. Clearly a derivation is
normal iff it does not contain a maximal segment.

The argument in 3.7 needs to be refined to also cover the rules for $\lor$, $\land$, $\exists$.
The reason for the difficulty is that in the E-rules $\lor^-$, $\land^-$, $\exists^-$ the subformulas
of a major premise $A \lor B$, $A \land B$ or $\exists x A$ of an E-rule application do not
appear in the conclusion, but among the assumptions being discharged by
the application. This suggests the definition of track below.

The general notion of a track is designed to retain the subformula property
in case one passes through the major premise of an application of a
$\lor^-$, $\land^-$, $\exists^-$-rule. In a track, when arriving at an $A_i$ which is the major
premise of an application of such a rule, we take for $A_{i+1}$ a hypothesis
discharged by this rule.


DEFINITION. A track of a derivation $M$ is a sequence of f.o.'s $A_0, \dots, A_n$
such that
(a) $A_0$ is a top f.o. in $M$ not discharged by an application of an
    $\lor^-$, $\land^-$, $\exists^-$-rule;
(b) $A_i$ for $i < n$ is not the minor premise of an instance of $\to^-$, and either
    (i) $A_i$ is not the major premise of an instance of a $\lor^-$, $\land^-$, $\exists^-$-rule and
        $A_{i+1}$ is directly below $A_i$, or
    (ii) $A_i$ is the major premise of an instance of a $\lor^-$, $\land^-$, $\exists^-$-rule and
        $A_{i+1}$ is an assumption discharged by this instance;
(c) $A_n$ is either
    (i) the minor premise of an instance of $\to^-$, or
    (ii) the conclusion of $M$, or
    (iii) the major premise of an instance of a $\lor^-$, $\land^-$, $\exists^-$-rule in case there
        are no assumptions discharged by this instance.


PROPOSITION. Let $M$ be a normal derivation, and let
$\pi = \sigma_0, \dots, \sigma_n$ be a track in $M$. Then there is a segment
$\sigma_i$ in $\pi$, the minimum segment or minimum part of the track, which
separates two (possibly empty) parts of $\pi$, called the E-part (elimination
part) and the I-part (introduction part) of $\pi$ such that

(a) for each $\sigma_j$ in the E-part one has $j < i$, $\sigma_j$ is a major
    premise of an E-rule, and $\sigma_{j+1}$ is a strictly positive part of
    $\sigma_j$, and therefore each $\sigma_j$ is a s.p.p. of $\sigma_0$;

(b) for each $\sigma_j$ which is the minimum segment or is in the I-part one
    has $i \le j$, and if $j \ne n$, then $\sigma_j$ is a premise of an I-rule
    and a s.p.p. of $\sigma_{j+1}$, so each $\sigma_j$ is a s.p.p. of
    $\sigma_n$.

DEFINITION. A track of order 0, or main track, in a normal derivation
is a track ending either in the conclusion of the whole derivation or in the
major premise of an application of a $\lor^-$, $\land^-$ or $\exists^-$-rule,
provided there are no assumption variables discharged by the application. A
track of order $n + 1$ is a track ending in the minor premise of an
$\to^-$-application, with major premise belonging to a track of order $n$.

A main branch of a derivation is a branch $\pi$ in the prooftree such that
$\pi$ passes only through premises of I-rules and major premises of E-rules,
and $\pi$ begins at a top node and ends in the conclusion.


REMARK. By an obvious simplification conversion we may remove every
application of a $\lor^-$, $\land^-$ or $\exists^-$-rule that discharges no
assumption variables. If such simplification conversions are performed, each
track of order 0 in a normal derivation is a track ending in the conclusion of
the whole derivation.


If we search for a main branch going upwards from the conclusion, the
branch to be followed is unique as long as we do not encounter an
$\land^+$-application.


LEMMA. In a normal derivation each formula occurrence belongs to some
track.

PROOF. By induction on the height of normal derivations. For example,
suppose a derivation $K$ ends with an $\exists^-$-application:

$$\frac{\begin{array}{c} \mid M \\ \exists x A \end{array} \qquad
        \begin{array}{c} [u\colon A] \\ \mid N \\ B \end{array}}
       {B}\ \exists^- x, u$$

$B$ in $N$ belongs to a track $\pi$ (induction hypothesis); either this does
not start in $u\colon A$, and then $\pi, B$ is a track in $K$ which ends in
the conclusion; or $\pi$ starts in $u\colon A$, and then there is a track
$\pi'$ in $M$ (induction hypothesis) such that $\pi', \pi, B$ is a track in
$K$ ending in the conclusion. The other cases are left to the reader.


THEOREM (Subformula property). Let $M$ be a normal derivation where
every application of a $\lor^-$, $\land^-$ or $\exists^-$-rule discharges at
least one assumption variable. Then each formula occurring in the derivation
is a subformula of either the end formula or else an assumption formula.


PROOF. As noted above, each track of order 0 in $M$ is a track ending in
the conclusion of $M$. We can now prove the theorem for tracks of order $n$,
by induction on $n$.


THEOREM (Disjunction property). If $\Gamma$ does not contain a disjunction
as s.p.p. (= strictly positive part, defined in 1.3), then, if
$\Gamma \vdash A \lor B$, it follows that $\Gamma \vdash A$ or
$\Gamma \vdash B$.


PROOF. Consider a normal derivation $M$ of $A \lor B$ from assumptions
$\Gamma$ not containing a disjunction as s.p.p. The conclusion $A \lor B$ is
the final formula of a (main) track, whose top formula $A_0$ in $M$ must be an
assumption in $\Gamma$. Since $\Gamma$ does not contain a disjunction as
s.p.p., the segment $\sigma$ with the conclusion $A \lor B$ is in the I-part.
Skip the final $\lor^+_i$-rule and replace the formulas in $\sigma$ by $A$ if
$i = 0$, and by $B$ if $i = 1$.


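There is a similar theorem for the existential quantifier:

THEOREM (Explicit definability under hypotheses). Let
$\Gamma \vdash \exists x A$.
(a) If $\Gamma$ does not contain an existential s.p.p., then there are terms
    $r_1, r_2, \dots, r_n$ such that
    $\Gamma \vdash A[x := r_1] \lor \dots \lor A[x := r_n]$.
(b) If $\Gamma$ neither contains a disjunctive s.p.p., nor an existential
    s.p.p., then there is a term $r$ such that $\Gamma \vdash A[x := r]$.


PROOF. Consider a normal derivation $M$ of $\exists x A$ from assumptions
$\Gamma$ not containing an existential s.p.p. We use induction on the
derivation, and distinguish cases on the last rule.

(a). By assumption the last rule cannot be $\exists^-$. We only consider the
case $\lor^-$ and leave the others to the reader.

$$\frac{\begin{array}{c} \mid M \\ B \lor C \end{array} \qquad
        \begin{array}{c} [u\colon B] \\ \mid N_0 \\ \exists x A \end{array} \qquad
        \begin{array}{c} [v\colon C] \\ \mid N_1 \\ \exists x A \end{array}}
       {\exists x A}\ \lor^- u, v$$

By assumption again neither $B$ nor $C$ can have an existential s.p.p.
Applying the induction hypothesis to $N_0$ and $N_1$ we obtain

$$\frac{\begin{array}{c} \mid M \\ B \lor C \end{array} \qquad
        \dfrac{\begin{array}{c} [u\colon B] \\ \mid N_0 \\
               \bigvee_{i=1}^{n} A[x := r_i] \end{array}}
              {\bigvee_{i=1}^{n+m} A[x := r_i]}\ \lor^+ \qquad
        \dfrac{\begin{array}{c} [v\colon C] \\ \mid N_1 \\
               \bigvee_{i=n+1}^{n+m} A[x := r_i] \end{array}}
              {\bigvee_{i=1}^{n+m} A[x := r_i]}\ \lor^+}
       {\bigvee_{i=1}^{n+m} A[x := r_i]}\ \lor^- u, v$$

(b). Similarly; by assumption the last rule can be neither $\lor^-$ nor
$\exists^-$.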


REMARK. Rasiowa-Harrop formulas (in the literature also called Harrop
formulas) are formulas for which no s.p.p. is a disjunction or an existential
formula. For $\Gamma$ consisting of Rasiowa-Harrop formulas both theorems
above hold.


5. Notes 


The proof of the existence of normal forms w.r.t. permutative conversions
is originally due to Prawitz [20]. We have adapted a method developed by
Joachimski and Matthes [13], which in turn is based on van Raamsdonk's
and Severi's [28].


CHAPTER 2 
Models 


It is an obvious question to ask whether the logical rules we have been 
considering suffice, i.e. whether we have forgotten some necessary rules. To 
answer this question we first have to fix the meaning of a formula, i.e. we 
have to provide a semantics. 

This is rather straightforward for classical logic: we can take the usual
notion of a structure (or model, or (universal) algebra). However, for
minimal and intuitionistic logic we need a more refined notion: we shall use
so-called Beth-structures here. Using this concept of a model we will prove
soundness and completeness for both minimal and intuitionistic logic. As a
corollary we will obtain completeness of classical logic, w.r.t. the standard
notion of a structure.


1. Structures for Classical Logic 


1.1. Structures. We define the notion of a structure (more accurately
$\mathcal{L}$-structure) and define what the value of a term and the meaning
of a formula in such a structure should be.


DEFINITION. $\mathcal{M} = (D, I)$ is a pre-structure (or
$\mathcal{L}$-pre-structure), if $D$ is a non-empty set (the carrier set or
the domain of $\mathcal{M}$) and $I$ is a map (interpretation) assigning to
every $n$-ary function symbol $f$ of $\mathcal{L}$ a function

    $I(f)\colon D^n \to D$.

In case $n = 0$, $I(f)$ is an element of $D$. $\mathcal{M} = (D, I_0, I_1)$ is
a structure (or $\mathcal{L}$-structure), if $(D, I_0)$ is a pre-structure and
$I_1$ a map assigning to every $n$-ary relation symbol $R$ of $\mathcal{L}$ an
$n$-ary relation

    $I_1(R) \subseteq D^n$.

In case $n = 0$, $I_1(R)$ is one of the truth values 1 and 0; in particular we
require $I_1(\bot) = 0$.

If $\mathcal{M} = (D, I)$ or $(D, I_0, I_1)$, then we often write
$|\mathcal{M}|$ for the carrier set $D$ of $\mathcal{M}$ and
$f^{\mathcal{M}}$, $R^{\mathcal{M}}$ for the interpretations $I_0(f)$,
$I_1(R)$ of the function and relation symbols.

An assignment (or variable assignment) in $D$ is a map $\eta$ assigning to
every variable $x \in \mathrm{dom}(\eta)$ a value $\eta(x) \in D$. Finite
assignments will be written as $[x_1 := a_1, \dots, x_n := a_n]$ (or else as
$[a_1/x_1, \dots, a_n/x_n]$), with distinct $x_1, \dots, x_n$. If $\eta$ is an
assignment in $D$ and $a \in D$, let $\eta_x^a$ be the assignment in $D$
mapping $x$ to $a$ and coinciding with $\eta$ elsewhere, so

    $\eta_x^a(y) := \begin{cases} \eta(y), & \text{if } y \ne x \\
                                  a,       & \text{if } y = x. \end{cases}$

Let a pre-structure $\mathcal{M}$ and an assignment $\eta$ in $|\mathcal{M}|$
be given. We define a homomorphic extension of $\eta$ (denoted by $\eta$ as
well) to the set $\mathrm{Ter}_{\mathcal{L}}$ of $\mathcal{L}$-terms $t$ such
that $\mathrm{vars}(t) \subseteq \mathrm{dom}(\eta)$ by

    $\eta(c) := c^{\mathcal{M}}$,
    $\eta(f(t_1, \dots, t_n)) := f^{\mathcal{M}}(\eta(t_1), \dots, \eta(t_n))$.

Observe that the extension of $\eta$ depends on $\mathcal{M}$; therefore we
may also write $t^{\mathcal{M}}[\eta]$ for $\eta(t)$.

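For every structure $\mathcal{M}$, assignment $\eta$ in $|\mathcal{M}|$ and
formula $A$ with $\mathrm{FV}(A) \subseteq \mathrm{dom}(\eta)$ we define
$\mathcal{M} \models A[\eta]$ (read: $A$ is valid in $\mathcal{M}$ under the
assignment $\eta$) by recursion on $A$.

    $\mathcal{M} \models R(t_1, \dots, t_n)[\eta] :\iff (t_1^{\mathcal{M}}[\eta], \dots, t_n^{\mathcal{M}}[\eta]) \in I_1(R)$  for $R$ not 0-ary.
    $\mathcal{M} \models R[\eta] :\iff I_1(R) = 1$  for $R$ 0-ary.
    $\mathcal{M} \models (A \land B)[\eta] :\iff \mathcal{M} \models A[\eta]$ and $\mathcal{M} \models B[\eta]$.
    $\mathcal{M} \models (A \lor B)[\eta] :\iff \mathcal{M} \models A[\eta]$ or $\mathcal{M} \models B[\eta]$.
    $\mathcal{M} \models (A \to B)[\eta] :\iff$ if $\mathcal{M} \models A[\eta]$, then $\mathcal{M} \models B[\eta]$.
    $\mathcal{M} \models (\forall x A)[\eta] :\iff$ for all $a \in |\mathcal{M}|$ we have $\mathcal{M} \models A[\eta_x^a]$.
    $\mathcal{M} \models (\exists x A)[\eta] :\iff$ there is an $a \in |\mathcal{M}|$ such that $\mathcal{M} \models A[\eta_x^a]$.

Because of $I_1(\bot) = 0$ we have in particular
$\mathcal{M} \not\models \bot[\eta]$.
If $\Gamma$ is a set of formulas, we write $\mathcal{M} \models \Gamma[\eta]$,
if for all $A \in \Gamma$ we have $\mathcal{M} \models A[\eta]$. If
$\mathcal{M} \models A[\eta]$ for all assignments $\eta$ in $|\mathcal{M}|$,
we write $\mathcal{M} \models A$.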
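
For a finite carrier set the recursion above can be run directly. The following
is only a sketch under stated assumptions: the tuple encoding of terms and
formulas, the names `term_value` and `holds`, and the three-element example
structure are all hypothetical choices made for illustration, not part of the
text.

```python
# Assumed encoding (not from the text):
#   terms:    a variable name (str) or a tuple (f, t1, ..., tn)
#   formulas: ("rel", R, t1, ..., tn), ("bot",), ("and", A, B), ("or", A, B),
#             ("imp", A, B), ("all", x, A), ("ex", x, A)

def term_value(t, funcs, eta):
    """eta(t): the homomorphic extension of the assignment eta to terms."""
    if isinstance(t, str):
        return eta[t]
    f, *args = t
    return funcs[f](*(term_value(s, funcs, eta) for s in args))


def holds(A, D, funcs, rels, eta):
    """Decide M |= A[eta] for a finite structure M = (D, funcs, rels)."""
    op = A[0]
    if op == "rel":
        _, R, *ts = A
        return tuple(term_value(t, funcs, eta) for t in ts) in rels[R]
    if op == "bot":
        return False  # I_1(bot) = 0
    if op == "and":
        return holds(A[1], D, funcs, rels, eta) and holds(A[2], D, funcs, rels, eta)
    if op == "or":
        return holds(A[1], D, funcs, rels, eta) or holds(A[2], D, funcs, rels, eta)
    if op == "imp":
        return (not holds(A[1], D, funcs, rels, eta)) or holds(A[2], D, funcs, rels, eta)
    if op == "all":
        _, x, B = A
        # eta_x^a is the updated dictionary {**eta, x: a}
        return all(holds(B, D, funcs, rels, {**eta, x: a}) for a in D)
    if op == "ex":
        _, x, B = A
        return any(holds(B, D, funcs, rels, {**eta, x: a}) for a in D)
    raise ValueError(f"unknown connective: {op!r}")


# Hypothetical example structure: carrier {0, 1, 2}, S = successor mod 3,
# "<" = the usual strict order.
D = [0, 1, 2]
funcs = {"S": lambda a: (a + 1) % 3}
rels = {"<": {(0, 1), (0, 2), (1, 2)}}

serial = ("all", "x", ("ex", "y", ("rel", "<", "x", "y")))
asymmetric = ("all", "x", ("all", "y",
              ("imp", ("rel", "<", "x", "y"),
                      ("imp", ("rel", "<", "y", "x"), ("bot",)))))
step = ("ex", "x", ("rel", "<", "x", ("S", "x")))
```

In this example `serial` fails (the element 2 has nothing above it), while
`asymmetric` and `step` hold; since all three formulas are closed, the empty
assignment `{}` suffices.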


1.2. Coincidence and Substitution Lemma. 
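LEMMA (Coincidence). Let $\mathcal{M}$ be a structure, $t$ a term, $A$ a
formula and $\eta, \xi$ assignments in $|\mathcal{M}|$.

(a) If $\eta(x) = \xi(x)$ for all $x \in \mathrm{vars}(t)$, then
    $\eta(t) = \xi(t)$.
(b) If $\eta(x) = \xi(x)$ for all $x \in \mathrm{FV}(A)$, then
    $\mathcal{M} \models A[\eta]$ iff $\mathcal{M} \models A[\xi]$.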


PROOF. Induction on terms and formulas. 


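LEMMA (Substitution). Let $\mathcal{M}$ be an $\mathcal{L}$-structure, $t, r$
$\mathcal{L}$-terms, $A$ an $\mathcal{L}$-formula and $\eta$ an assignment in
$|\mathcal{M}|$. Then

(a) $\eta(r[x := t]) = \eta_x^{\eta(t)}(r)$.
(b) $\mathcal{M} \models A[x := t][\eta] \iff \mathcal{M} \models A[\eta_x^{\eta(t)}]$.


PROOF. (a). Induction on $r$. (b). Induction on $A$. We restrict ourselves
to the cases of an atomic formula and a universal formula; the other cases
are easier.

Case $R(s_1, \dots, s_n)$. For simplicity assume $n = 1$. Then

    $\mathcal{M} \models R(s)[x := t][\eta]$
      $\iff \mathcal{M} \models R(s[x := t])[\eta]$
      $\iff \eta(s[x := t]) \in R^{\mathcal{M}}$
      $\iff \eta_x^{\eta(t)}(s) \in R^{\mathcal{M}}$   by (a)
      $\iff \mathcal{M} \models R(s)[\eta_x^{\eta(t)}]$.

Case $\forall y A$. We may assume $y \ne x$ and $y \notin \mathrm{vars}(t)$.

    $\mathcal{M} \models (\forall y A)[x := t][\eta]$
      $\iff \mathcal{M} \models (\forall y\, A[x := t])[\eta]$
      $\iff$ for all $a \in |\mathcal{M}|$, $\mathcal{M} \models A[x := t][\eta_y^a]$
      $\iff$ for all $a \in |\mathcal{M}|$, $\mathcal{M} \models A[(\eta_y^a)_x^b]$ with $b := \eta_y^a(t) = \eta(t)$
            (by IH and the coincidence lemma)
      $\iff$ for all $a \in |\mathcal{M}|$, $\mathcal{M} \models A[(\eta_x^b)_y^a]$   (because $x \ne y$)
      $\iff \mathcal{M} \models (\forall y A)[\eta_x^b]$.

This completes the proof.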
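
Part (a) of the substitution lemma can be checked on a concrete term. This is
a small sketch, assuming a hypothetical tuple encoding of terms (a variable is
a string, a compound term is a tuple `(f, t1, ..., tn)`) and two made-up
function symbols `plus` and `succ` interpreted over the integers.

```python
def value(t, funcs, eta):
    """eta(t): value of the term t under the assignment eta."""
    if isinstance(t, str):
        return eta[t]
    f, *args = t
    return funcs[f](*(value(s, funcs, eta) for s in args))


def subst(r, x, t):
    """r[x := t]: replace every occurrence of the variable x in r by t."""
    if isinstance(r, str):
        return t if r == x else r
    f, *args = r
    return (f, *(subst(s, x, t) for s in args))


# Hypothetical interpretation over the integers.
funcs = {"plus": lambda a, b: a + b, "succ": lambda a: a + 1}

r = ("plus", "x", ("succ", "y"))
t = ("succ", "x")
eta = {"x": 2, "y": 10}

# Left-hand side: evaluate the substituted term under eta.
lhs = value(subst(r, "x", t), funcs, eta)
# Right-hand side: evaluate r under eta with x updated to eta(t).
rhs = value(r, funcs, {**eta, "x": value(t, funcs, eta)})
```

Both sides evaluate to the same integer, as the lemma predicts; the check is of
course only an instance, not a proof.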


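1.3. Soundness. We prove the soundness theorem: it says that every
formula derivable in classical logic is valid in an arbitrary structure.


THEOREM (Soundness). Let $\Gamma \vdash_c B$. If $\mathcal{M}$ is a structure
and $\eta$ an assignment in $|\mathcal{M}|$, then
$\mathcal{M} \models \Gamma[\eta]$ entails $\mathcal{M} \models B[\eta]$.


PROOF. Induction on derivations. The given derivation of $B$ from $\Gamma$
can only have finitely many free assumptions; hence we may assume
$\Gamma = \{A_1, \dots, A_n\}$.

Case $u\colon B$. Then $B \in \Gamma$ and the claim is obvious.

Case $\mathrm{Stab}_R\colon \forall \vec{x}.\, \neg\neg R\vec{x} \to R\vec{x}$.
Again the claim is clear, since $\mathcal{M} \models \neg\neg A[\eta]$ means
the same as $\mathcal{M} \models A[\eta]$.

Case $\to^-$. Assume $\mathcal{M} \models \Gamma[\eta]$. We must show
$\mathcal{M} \models B[\eta]$. By IH, $\mathcal{M} \models (A \to B)[\eta]$
and $\mathcal{M} \models A[\eta]$. The claim follows from the definition of
$\models$.

Case $\to^+$. Assume $\mathcal{M} \models \Gamma[\eta]$. We must show
$\mathcal{M} \models (A \to B)[\eta]$. So assume in addition
$\mathcal{M} \models A[\eta]$. We must show $\mathcal{M} \models B[\eta]$. By
IH (with $\Gamma \cup \{A\}$ instead of $\Gamma$) this clearly holds.

Case $\forall^+$. Assume $\mathcal{M} \models \Gamma[\eta]$. We must show
$\mathcal{M} \models A[\eta_x^a]$. We may assume that all assumptions
$A_1, \dots, A_n$ actually occur in the given derivation. Since, because of
the variable condition for $\forall^+$, the variable $x$ does not appear free
in any of the formulas $A_1, \dots, A_n$, we have by the coincidence lemma
$\mathcal{M} \models \Gamma[\eta_x^a]$. The IH (with $\eta_x^a$ instead of
$\eta$) yields $\mathcal{M} \models A[\eta_x^a]$.

Case $\forall^-$. Assume $\mathcal{M} \models \Gamma[\eta]$. We must show
$\mathcal{M} \models A[x := t][\eta]$, i.e. by the substitution lemma
$\mathcal{M} \models A[\eta_x^b]$ with $b := \eta(t)$. By IH,
$\mathcal{M} \models (\forall x A)[\eta]$, i.e.
$\mathcal{M} \models A[\eta_x^a]$ for all $a \in |\mathcal{M}|$. With
$\eta(t)$ for $a$ the claim follows.

The other cases are proved similarly.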


2. Beth-Structures for Minimal Logic 


2.1. Beth-Structures. Consider a partially ordered set of "possible
worlds". The worlds are represented as nodes in a finitely branching tree.
They may be thought of as possible states such that all nodes "above" a
node $k$ are the ways in which $k$ may develop in the future. The worlds are
increasing, that is, if an atomic formula $R\vec{t}$ is true in a world $k$,
then $R\vec{t}$ is true in all worlds "above" $k$.

More formally, each Beth-structure is based on a finitely branching tree
$T$. A node $k$ over a set $S$ is a finite sequence
$k = \langle a_0, a_1, \dots, a_{n-1} \rangle$ of elements of $S$;
$\mathrm{lh}(k)$ is the length of $k$. We write $k \preceq k'$ if $k$ is an
initial segment of $k'$. A tree on $S$ is a set of nodes closed under initial
segments. A tree $T$ is finitely branching if every node in $T$ has finitely
many immediate successors.

A tree $T$ is unbounded if for every $n \in \mathbb{N}$ there is a node
$k \in T$ such that $\mathrm{lh}(k) = n$. A branch of $T$ is a linearly
ordered subtree of $T$. A leaf is a node without successors in $T$.

For the proof of the completeness theorem, a Beth-structure based on
a complete binary tree (i.e. the complete tree over $\{0, 1\}$) will suffice.
The nodes will be all the finite sequences of 0's and 1's, and the ordering
is as above. The root is the empty sequence and $k0$ is the sequence $k$ with
the postfix 0. Similarly for $k1$.
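
The tree notions just introduced are easy to experiment with on a finite
fragment of the complete binary tree. The sketch below represents nodes as
Python tuples; the function names (`is_initial_segment`, `is_tree`,
`binary_tree_upto`, `extensions_at`) are hypothetical and chosen only for this
illustration. Note that a finite fragment is bounded, unlike the complete
binary tree itself.

```python
from itertools import product


def is_initial_segment(k, k2):
    """k is an initial segment of k2 (the ordering on nodes)."""
    return len(k) <= len(k2) and k2[:len(k)] == k


def is_tree(nodes):
    """A set of nodes is a tree if it is closed under initial segments."""
    return all(k[:i] in nodes for k in nodes for i in range(len(k)))


def binary_tree_upto(depth):
    """All 0-1-sequences of length <= depth (a finite part of the full tree)."""
    return {tuple(bits)
            for n in range(depth + 1)
            for bits in product((0, 1), repeat=n)}


def extensions_at(tree, k, n):
    """All nodes k2 in tree extending k with lh(k2) = lh(k) + n."""
    return {k2 for k2 in tree
            if len(k2) == len(k) + n and is_initial_segment(k, k2)}


T = binary_tree_upto(3)
```

For instance, `extensions_at(T, (0,), 2)` collects the four nodes of length 3
above the node `(0,)`; quantification over exactly such level sets is what the
forcing clauses for atoms, disjunction and the existential quantifier use.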


DEFINITION. Let $(T, \preceq)$ be a finitely branching tree.
$\mathcal{B} = (D, I_0, I_1)$ is an $\mathcal{L}$-Beth-structure on $T$, where
$D$ is a nonempty set, and for each $n$-ary function symbol $f$ in
$\mathcal{L}$, $I_0$ assigns $f$ a map $I_0(f)\colon D^n \to D$. For each
$n$-ary relation symbol $R$ in $\mathcal{L}$ and each node $k \in T$,
$I_1(R, k) \subseteq D^n$ is assigned in such a way that monotonicity is
preserved, that is,

    $k \preceq k' \Rightarrow I_1(R, k) \subseteq I_1(R, k')$.

If $n = 0$, then $I_1(R, k)$ is either true or false, and it follows by the
monotonicity that if $k \preceq k'$ and $I_1(R, k)$ then $I_1(R, k')$.


There is no special requirement set on $I_1(\bot, k)$. In minimal logic,
falsum $\bot$ plays the role of an ordinary propositional variable.

For an assignment $\eta$, $t^{\mathcal{B}}[\eta]$ is understood classically.
The classical satisfaction relation $\mathcal{M} \models A[\eta]$ is replaced
with the forcing relation in Beth-structures. It is obvious from the
definition that any $T$ can be extended to a complete tree $\bar{T}$ without
leaves, in which for each leaf $k \in T$ all sequences
$k0, k00, k000, \dots$ are added to $T$. For each node $k0\dots0$, we add
$I_1(R, k0\dots0) := I_1(R, k)$.


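DEFINITION. $\mathcal{B}, k \Vdash A[\eta]$ ($\mathcal{B}$ forces $A$ at a
node $k$ for an assignment $\eta$) is defined inductively as follows. We write
$k \Vdash A[\eta]$ when it is clear from the context what the underlying
structure $\mathcal{B}$ is, and we write $\forall k' \succeq_n k\ A$ for
$\forall k' \succeq k.\ \mathrm{lh}(k') = \mathrm{lh}(k) + n \to A$.

    $k \Vdash R(t_1, \dots, t_p)[\eta] :\iff \exists n\, \forall k' \succeq_n k\ (t_1^{\mathcal{B}}[\eta], \dots, t_p^{\mathcal{B}}[\eta]) \in I_1(R, k')$,  if $R$ is not 0-ary.
    $k \Vdash R[\eta] :\iff \exists n\, \forall k' \succeq_n k\ I_1(R, k') = 1$,  if $R$ is 0-ary.
    $k \Vdash (A \lor B)[\eta] :\iff \exists n\, \forall k' \succeq_n k.\ k' \Vdash A[\eta]$ or $k' \Vdash B[\eta]$.
    $k \Vdash (\exists x A)[\eta] :\iff \exists n\, \forall k' \succeq_n k\ \exists a \in |\mathcal{B}|\ k' \Vdash A[\eta_x^a]$.
    $k \Vdash (A \to B)[\eta] :\iff \forall k' \succeq k.\ k' \Vdash A[\eta] \Rightarrow k' \Vdash B[\eta]$.
    $k \Vdash (A \land B)[\eta] :\iff k \Vdash A[\eta]$ and $k \Vdash B[\eta]$.
    $k \Vdash (\forall x A)[\eta] :\iff \forall a \in |\mathcal{B}|\ k \Vdash A[\eta_x^a]$.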


The clauses for atoms, disjunction and existential quantifier include a
concept of a "bar" in $T$.

2.2. Covering Lemma. It is easily seen (from the definition and using
monotonicity) that from $k \Vdash A[\eta]$ and $k \preceq k'$ we can conclude
$k' \Vdash A[\eta]$. The converse is also true:


LEMMA (Covering Lemma).

    $\forall k' \succeq_n k\ k' \Vdash A[\eta] \Rightarrow k \Vdash A[\eta]$.


PROOF. Induction on $A$. We write $k \Vdash A$ for $k \Vdash A[\eta]$.
Case $R\vec{t}$. Assume

    $\forall k' \succeq_n k\ k' \Vdash R\vec{t}$,

hence by definition

    $\forall k' \succeq_n k\ \exists m\, \forall k'' \succeq_m k'\ \vec{t}^{\,\mathcal{B}}[\eta] \in I_1(R, k'')$.

Since $T$ is a finitely branching tree,

    $\exists m\, \forall k' \succeq_m k\ \vec{t}^{\,\mathcal{B}}[\eta] \in I_1(R, k')$.

Hence $k \Vdash R\vec{t}$.

The cases $A \lor B$ and $\exists x A$ are handled similarly.

Case $A \to B$. Let $k' \Vdash A \to B$ for all $k' \succeq k$ with
$\mathrm{lh}(k') = \mathrm{lh}(k) + n$. We show

    $\forall l \succeq k.\ l \Vdash A \Rightarrow l \Vdash B$.

Let $l \succeq k$ and $l \Vdash A$. We show that $l \Vdash B$. We apply the IH
to $B$ and $m := \max(\mathrm{lh}(k) + n, \mathrm{lh}(l))$. So assume
$l' \succeq l$ and $\mathrm{lh}(l') = m$. It is sufficient to show
$l' \Vdash B$. If $\mathrm{lh}(l') = \mathrm{lh}(l)$, then $l' = l$ and we are
done. If $\mathrm{lh}(l') = \mathrm{lh}(k) + n > \mathrm{lh}(l)$, then $l'$ is
an extension of $l$ as well as of $k$ and has length $\mathrm{lh}(k) + n$, and
hence $l' \Vdash A \to B$ by assumption. Moreover, $l' \Vdash A$, since
$l' \succeq l$ and $l \Vdash A$. It follows that $l' \Vdash B$.

The cases $A \land B$ and $\forall x A$ are obvious.


2.3. Coincidence and Substitution. The coincidence and substitution
lemmas hold for Beth-structures.


LEMMA (Coincidence). Let $\mathcal{B}$ be a Beth-structure, $t$ a term, $A$ a
formula and $\eta, \xi$ assignments in $|\mathcal{B}|$.

(a) If $\eta(x) = \xi(x)$ for all $x \in \mathrm{vars}(t)$, then
    $\eta(t) = \xi(t)$.
(b) If $\eta(x) = \xi(x)$ for all $x \in \mathrm{FV}(A)$, then
    $\mathcal{B}, k \Vdash A[\eta] \iff \mathcal{B}, k \Vdash A[\xi]$.


PROOF. Induction on terms and formulas. 


LEMMA (Substitution). Let $\mathcal{B}$ be a Beth-structure, $t, r$ terms,
$A$ a formula and $\eta$ an assignment in $|\mathcal{B}|$. Then

(a) $\eta(r[x := t]) = \eta_x^{\eta(t)}(r)$.
(b) $\mathcal{B}, k \Vdash A[x := t][\eta] \iff \mathcal{B}, k \Vdash A[\eta_x^{\eta(t)}]$.


PROOF. Induction on terms and formulas.

2.4. Soundness. As usual, we proceed to prove the soundness theorem.


THEOREM (Soundness). Let $\Gamma \cup \{A\}$ be a set of formulas such that
$\Gamma \vdash A$. Then, if $\mathcal{B}$ is a Beth-structure, $k$ a node and
$\eta$ an assignment in $|\mathcal{B}|$, it follows that
$\mathcal{B}, k \Vdash \Gamma[\eta]$ entails $\mathcal{B}, k \Vdash A[\eta]$.


PROOF. Induction on derivations.

We begin with the axiom schemes $\lor^+_0$, $\lor^+_1$, $\lor^-$,
$\exists^+$ and $\exists^-$. $k \Vdash C[\eta]$ is abbreviated $k \Vdash C$,
when $\eta$ is known from the context.

Case $\lor^+_0\colon A \to A \lor B$. We show $k \Vdash A \to A \lor B$.
Assume for $k' \succeq k$ that $k' \Vdash A$. Show: $k' \Vdash A \lor B$. This
follows from the definition, since $k' \Vdash A$. The case
$\lor^+_1\colon B \to A \lor B$ is symmetric.

Case $\lor^-\colon A \lor B \to (A \to C) \to (B \to C) \to C$. We show that
$k \Vdash A \lor B \to (A \to C) \to (B \to C) \to C$. Assume for
$k' \succeq k$ that $k' \Vdash A \lor B$, $k' \Vdash A \to C$ and
$k' \Vdash B \to C$ (we can safely assume that $k'$ is the same for all three
premises). Show that $k' \Vdash C$. By definition, there is an $n$ such that
for all $k'' \succeq_n k'$, $k'' \Vdash A$ or $k'' \Vdash B$. In both cases it
follows that $k'' \Vdash C$, since $k' \Vdash A \to C$ and
$k' \Vdash B \to C$. By the covering lemma, $k' \Vdash C$.

Case $\exists^+\colon A \to \exists x A$. Show that
$k \Vdash (A \to \exists x A)[\eta]$. Assume that $k' \succeq k$ and
$k' \Vdash A[\eta]$. Show that $k' \Vdash (\exists x A)[\eta]$. Since
$\eta = \eta_x^{\eta(x)}$ there is an $a \in |\mathcal{B}|$ (namely
$a := \eta(x)$) such that $k' \Vdash A[\eta_x^a]$. Hence,
$k' \Vdash (\exists x A)[\eta]$.

Case $\exists^-\colon \exists x A \to (\forall x.\, A \to B) \to B$ and
$x \notin \mathrm{FV}(B)$. We show that
$k \Vdash (\exists x A \to (\forall x.\, A \to B) \to B)[\eta]$. Assume that
$k' \succeq k$ and $k' \Vdash (\exists x A)[\eta]$ and
$k' \Vdash (\forall x.\, A \to B)[\eta]$. We show $k' \Vdash B[\eta]$. By
definition, there is an $n$ such that for all $k'' \succeq_n k'$ we have
$a \in |\mathcal{B}|$ and $k'' \Vdash A[\eta_x^a]$. From
$k' \Vdash (\forall x.\, A \to B)[\eta]$ follows that
$k'' \Vdash B[\eta_x^a]$, and since $x \notin \mathrm{FV}(B)$, from the
coincidence lemma, $k'' \Vdash B[\eta]$. Then, finally, by the covering lemma
$k' \Vdash B[\eta]$.

Case $\to^+$. Let $k \Vdash \Gamma$ hold. We show that $k \Vdash A \to B$.
Assume $k' \succeq k$ and $k' \Vdash A$. Our goal is $k' \Vdash B$. We have
$k' \Vdash \Gamma \cup \{A\}$. Thus, $k' \Vdash B$ by IH.

Case $\to^-$. Let $k \Vdash \Gamma$ hold. The IH gives us $k \Vdash A \to B$
and $k \Vdash A$. Hence $k \Vdash B$.

Case $\forall^+$. Let $k \Vdash \Gamma[\eta]$ and
$x \notin \mathrm{FV}(\Gamma)$ hold. Show that
$k \Vdash (\forall x A)[\eta]$, i.e. $k \Vdash A[\eta_x^a]$ for an arbitrary
$a \in |\mathcal{B}|$. We have

    $k \Vdash \Gamma[\eta_x^a]$   by the coincidence lemma, since $x \notin \mathrm{FV}(\Gamma)$
    $k \Vdash A[\eta_x^a]$        by IH.

Case $\forall^-$. Let $k \Vdash \Gamma[\eta]$. We show that
$k \Vdash A[x := t][\eta]$. We have

    $k \Vdash (\forall x A)[\eta]$      by IH
    $k \Vdash A[\eta_x^{\eta(t)}]$      by definition
    $k \Vdash A[x := t][\eta]$          by the substitution lemma.

This concludes the proof.


2.5. Counter Models. With soundness at hand, it is easy to build
counter models for derivations not valid in minimal or intuitionistic logic.

A Beth-structure $\mathcal{B} = (D, I_0, I_1)$ for intuitionistic logic is a
Beth-structure in which $\bot$ is never forced, i.e. $I_1(\bot, k) = 0$ for
all $k$. Thus, in Beth-structures for intuitionistic logic we have

    $k \Vdash \neg A \iff \forall k' \succeq k\ k' \nVdash A$,
    $k \Vdash \neg\neg A \iff \forall k' \succeq k\ k' \nVdash \neg A$
                         $\iff \forall k' \succeq k\, \exists k'' \succeq k'\ k'' \Vdash A$.

As an example, we show that $\nvdash_i \neg\neg P \to P$. We describe the
desired Beth-structure by means of a diagram below. Next to each node, we
write the propositions forced on that node.


Then it is easily seen that

    $\langle\rangle \nVdash P$,   $\langle\rangle \Vdash \neg\neg P$.

Thus $\langle\rangle \nVdash \neg\neg P \to P$ and hence
$\nvdash \neg\neg P \to P$. Since for each $R$ and all $k$,
$k \Vdash \mathrm{Efq}_R$, it also follows that
$\nvdash_i \neg\neg P \to P$. The model also shows that the Peirce formula
$((P \to Q) \to P) \to P$ is invalid in intuitionistic logic.


3. Completeness of Minimal and Intuitionistic Logic 


Next, we show the converse of the soundness theorem, for minimal as well
as intuitionistic logic.


3.1. Completeness of Minimal Logic. 

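THEOREM (Completeness). Let $\Gamma \cup \{A\}$ be a set of formulas. Then
the following propositions are equivalent.
(a) $\Gamma \vdash A$.
(b) $\Gamma \Vdash A$, i.e. for all Beth-structures $\mathcal{B}$, nodes $k$
    and assignments $\eta$

    $\mathcal{B}, k \Vdash \Gamma[\eta] \Rightarrow \mathcal{B}, k \Vdash A[\eta]$.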


PROOF. Soundness is one direction. For the other direction we employ a
technique developed by Harvey Friedman and construct a Beth-structure
$\mathcal{B}$ (over the set $T_{01}$ of all finite 0-1-sequences $k$ ordered
by the initial segment relation $k \preceq k'$) with the property that
$\Gamma \vdash B$ is equivalent to
$\mathcal{B}, \langle\rangle \Vdash B[\mathrm{id}]$.

In order to define $\mathcal{B}$, we will need an enumeration
$A_0, A_1, A_2, \dots$ of $\mathcal{L}$-formulas, in which each formula occurs
infinitely many times. We also fix an enumeration $x_0, x_1, \dots$ of
variables. Let $\Gamma = \bigcup_n \Gamma_n$ be the union of finite sets
$\Gamma_n$ such that $\Gamma_n \subseteq \Gamma_{n+1}$. With each node
$k \in T_{01}$, we associate a finite set $\Delta_k$ of formulas by induction
on the length of $k$.
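
An enumeration in which every item recurs infinitely often can be obtained
from any plain enumeration by listing longer and longer initial segments. The
generator below is a sketch of this standard trick; the name
`repeat_infinitely` and the placeholder items `"A0"`, `"A1"`, ... are
hypothetical, and the text does not say which enumeration it has in mind.

```python
from itertools import count, islice


def repeat_infinitely(base):
    """Turn an enumeration base(0), base(1), ... into one where every item
    occurs infinitely often: stage s lists base(0), ..., base(s - 1)."""
    for stage in count(1):
        for j in range(stage):
            yield base(j)


# First ten entries for a placeholder enumeration of "formulas".
first_ten = list(islice(repeat_infinitely(lambda n: f"A{n}"), 10))
```

Every `base(j)` appears once in each stage from stage `j + 1` onwards, so for
every bound there is a later index at which it occurs again. This is the
property the construction needs when it chooses an $n \ge \mathrm{lh}(k)$ with
$B = A_n$.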

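Let $\Delta_{\langle\rangle} := \emptyset$. Take a node $k$ such that
$\mathrm{lh}(k) = n$ and suppose that $\Delta_k$ is already defined. Write
$\Delta \vdash_n B$ to mean that there is a derivation of length $\le n$ of
$B$ from $\Delta$. We define $\Delta_{k0}$ and $\Delta_{k1}$ as follows:

Case 1. $\Gamma_n, \Delta_k \nvdash_n A_n$. Then let

    $\Delta_{k0} := \Delta_k$ and $\Delta_{k1} := \Delta_k \cup \{A_n\}$.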




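Case 2. $\Gamma_n, \Delta_k \vdash_n A_n = A_n' \lor A_n''$. Then let

    $\Delta_{k0} := \Delta_k \cup \{A_n, A_n'\}$ and $\Delta_{k1} := \Delta_k \cup \{A_n, A_n''\}$.

Case 3. $\Gamma_n, \Delta_k \vdash_n A_n = \exists x A_n'$. Then let

    $\Delta_{k0} := \Delta_{k1} := \Delta_k \cup \{A_n, A_n'[x := x_i]\}$,

where $x_i$ is the first variable $\notin \mathrm{FV}(\Gamma_n, A_n, \Delta_k)$.

Case 4. $\Gamma_n, \Delta_k \vdash_n A_n$, and $A_n$ is neither a disjunction
nor an existentially quantified formula. Then let

    $\Delta_{k0} := \Delta_{k1} := \Delta_k \cup \{A_n\}$.

Obviously $k \preceq k'$ implies that $\Delta_k \subseteq \Delta_{k'}$. We
note that

(6)    $\forall k' \succeq_n k\ \Gamma, \Delta_{k'} \vdash B \Rightarrow \Gamma, \Delta_k \vdash B$.

It is sufficient to show

    $\Gamma, \Delta_{k0} \vdash B$ and $\Gamma, \Delta_{k1} \vdash B \Rightarrow \Gamma, \Delta_k \vdash B$.

In cases 1 and 4, this is obvious. For cases 2 and 3, it follows immediately
from the axiom schemes $\lor^-$ and $\exists^-$.

Next, we show

(7)    $\Gamma, \Delta_k \vdash B \Rightarrow \exists n\, \forall k' \succeq_n k\ B \in \Delta_{k'}$.

We choose $n \ge \mathrm{lh}(k)$ such that $B = A_n$ and
$\Gamma_n, \Delta_k \vdash_n A_n$. For all $k' \succeq k$, if
$\mathrm{lh}(k') = n + 1$ then $A_n \in \Delta_{k'}$ (cf. the cases 2-4).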

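Using the sets $\Delta_k$ we can define an $\mathcal{L}$-Beth-structure
$\mathcal{B}$ as $(\mathrm{Ter}_{\mathcal{L}}, I_0, I_1)$ (where
$\mathrm{Ter}_{\mathcal{L}}$ denotes the set of terms of $\mathcal{L}$) and
the canonical $I_0(f)\vec{t} := f\vec{t}$ and

    $\vec{t} \in I_1(R, k) :\iff R\vec{t} \in \Delta_k$.

Obviously, $t^{\mathcal{B}}[\mathrm{id}] = t$ for all $\mathcal{L}$-terms $t$.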
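We show that

(8)    $\Gamma, \Delta_k \vdash B \iff \mathcal{B}, k \Vdash B[\mathrm{id}]$,

by induction on the complexity of $B$. For
$\mathcal{B}, k \Vdash B[\mathrm{id}]$ we write $k \Vdash B$.

Case $R\vec{t}$. The following propositions are equivalent.

    $\Gamma, \Delta_k \vdash R\vec{t}$
    $\exists n\, \forall k' \succeq_n k\ R\vec{t} \in \Delta_{k'}$    by (7) and (6)
    $\exists n\, \forall k' \succeq_n k\ \vec{t} \in I_1(R, k')$      by definition of $\mathcal{B}$
    $k \Vdash R\vec{t}$                                               by definition of $\Vdash$, since $t^{\mathcal{B}}[\mathrm{id}] = t$.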


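Case $B \lor C$. $\Rightarrow$. Let $\Gamma, \Delta_k \vdash B \lor C$. Choose
an $n \ge \mathrm{lh}(k)$ such that
$\Gamma_n, \Delta_k \vdash_n A_n = B \lor C$. Then, for all $k' \succeq k$
such that $\mathrm{lh}(k') = n$ it follows that

    $\Delta_{k'0} = \Delta_{k'} \cup \{B \lor C, B\}$ and $\Delta_{k'1} = \Delta_{k'} \cup \{B \lor C, C\}$,

and by IH

    $k'0 \Vdash B$ and $k'1 \Vdash C$.

By definition, we have $k \Vdash B \lor C$. $\Leftarrow$.

    $k \Vdash B \lor C$
    $\exists n\, \forall k' \succeq_n k.\ k' \Vdash B$ or $k' \Vdash C$
    $\exists n\, \forall k' \succeq_n k.\ \Gamma, \Delta_{k'} \vdash B$ or $\Gamma, \Delta_{k'} \vdash C$    by IH
    $\exists n\, \forall k' \succeq_n k\ \Gamma, \Delta_{k'} \vdash B \lor C$
    $\Gamma, \Delta_k \vdash B \lor C$    by (6).

The case $B \land C$ is evident.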
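Case $B \to C$. $\Rightarrow$. Let $\Gamma, \Delta_k \vdash B \to C$. We must
show $k \Vdash B \to C$, i.e.,

    $\forall k' \succeq k.\ k' \Vdash B \Rightarrow k' \Vdash C$.

Let $k' \succeq k$ be such that $k' \Vdash B$. By IH, it follows that
$\Gamma, \Delta_{k'} \vdash B$, and $\Gamma, \Delta_{k'} \vdash C$ follows by
assumption. Then again by IH $k' \Vdash C$.

$\Leftarrow$. Let $k \Vdash B \to C$, i.e.
$\forall k' \succeq k.\ k' \Vdash B \Rightarrow k' \Vdash C$. We show that
$\Gamma, \Delta_k \vdash B \to C$. At this point, we apply (6). Choose an
$n \ge \mathrm{lh}(k)$ such that $B = A_n$. Let $k' \succeq_m k$ with
$m := n - \mathrm{lh}(k)$. We show that
$\Gamma, \Delta_{k'} \vdash B \to C$. If
$\Gamma_n, \Delta_{k'} \vdash_n A_n$, then $k' \Vdash B$ by IH, and
$k' \Vdash C$ by assumption, hence $\Gamma, \Delta_{k'} \vdash C$ again by IH
and thus $\Gamma, \Delta_{k'} \vdash B \to C$.

If $\Gamma_n, \Delta_{k'} \nvdash_n A_n$ then by definition
$\Delta_{k'1} = \Delta_{k'} \cup \{B\}$, hence
$\Gamma, \Delta_{k'1} \vdash B$, and $k'1 \Vdash B$ by IH. Now
$k'1 \Vdash C$ by assumption, and finally $\Gamma, \Delta_{k'1} \vdash C$ by
IH. From $\Delta_{k'1} = \Delta_{k'} \cup \{B\}$, it follows that
$\Gamma, \Delta_{k'} \vdash B \to C$.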

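Case $\forall x B$. The following propositions are equivalent.

    $\Gamma, \Delta_k \vdash \forall x B$
    $\forall t \in \mathrm{Ter}_{\mathcal{L}}\ \Gamma, \Delta_k \vdash B[x := t]$
    $\forall t \in \mathrm{Ter}_{\mathcal{L}}\ k \Vdash B[x := t]$           by IH
    $\forall t \in \mathrm{Ter}_{\mathcal{L}}\ k \Vdash B[\mathrm{id}_x^t]$   by the substitution lemma, since $t^{\mathcal{B}}[\mathrm{id}] = t$
    $k \Vdash \forall x B$                                                   by definition of $\Vdash$.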


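Case $\exists x B$. This case is similar to the case $\lor$. The proof
proceeds as follows. $\Rightarrow$. Let
$\Gamma, \Delta_k \vdash \exists x B$. Choose an $n \ge \mathrm{lh}(k)$ such
that $\Gamma_n, \Delta_k \vdash_n A_n = \exists x B$. Then, for all
$k' \succeq k$ such that $\mathrm{lh}(k') = n$ it follows that

    $\Delta_{k'0} = \Delta_{k'1} = \Delta_{k'} \cup \{\exists x B, B[x := x_i]\}$

where $x_i$ is not free in $\Delta_{k'} \cup \{\exists x B\}$. Hence by IH

    $k'0 \Vdash B[x := x_i]$ and $k'1 \Vdash B[x := x_i]$.

It follows by definition that $k \Vdash \exists x B$. $\Leftarrow$.

    $k \Vdash \exists x B$
    $\exists n\, \forall k' \succeq_n k\, \exists t \in \mathrm{Ter}_{\mathcal{L}}\ k' \Vdash B[\mathrm{id}_x^t]$
    $\exists n\, \forall k' \succeq_n k\, \exists t \in \mathrm{Ter}_{\mathcal{L}}\ k' \Vdash B[x := t]$
    $\exists n\, \forall k' \succeq_n k\, \exists t \in \mathrm{Ter}_{\mathcal{L}}\ \Gamma, \Delta_{k'} \vdash B[x := t]$    by IH
    $\exists n\, \forall k' \succeq_n k\ \Gamma, \Delta_{k'} \vdash \exists x B$
    $\Gamma, \Delta_k \vdash \exists x B$    by (6).

Now, we are in a position to finalize the proof of the completeness
theorem. We apply (b) to the Beth-structure $\mathcal{B}$ constructed above
from $\Gamma$, the empty node $\langle\rangle$ and the assignment
$\eta = \mathrm{id}$. Then
$\mathcal{B}, \langle\rangle \Vdash \Gamma[\mathrm{id}]$ by (8), hence
$\mathcal{B}, \langle\rangle \Vdash A[\mathrm{id}]$ by assumption and
therefore $\Gamma \vdash A$ by (8) again.

3.2. Completeness of Intuitionistic Logic. Completeness of
intuitionistic logic follows as a corollary.

COROLLARY. Let $\Gamma \cup \{A\}$ be a set of formulas. The following
propositions are equivalent.
(a) $\Gamma \vdash_i A$.
(b) $\Gamma, \mathrm{Efq} \Vdash A$, i.e., for all Beth-structures
    $\mathcal{B}$ for intuitionistic logic, nodes $k$ and assignments $\eta$

    $\mathcal{B}, k \Vdash \Gamma[\eta] \Rightarrow \mathcal{B}, k \Vdash A[\eta]$.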


4. Completeness of Classical Logic 


We give a proof of completeness of classical logic relying on the
completeness proof for minimal logic above. Write $\Gamma \models A$ to mean
that, for all structures $\mathcal{M}$ and assignments $\eta$,

    $\mathcal{M} \models \Gamma[\eta] \Rightarrow \mathcal{M} \models A[\eta]$.


4.1. The Completeness Theorem. 

THEOREM (Completeness). Let $\Gamma \cup \{A\}$ be a set of formulas (in
$\mathcal{L}$). The following propositions are equivalent.
(a) $\Gamma \vdash_c A$.
(b) $\Gamma \models A$.


PROOF. Soundness is one direction. For the other direction, we adapt
the completeness of minimal logic.

Evidently, it is sufficient to treat formulas without $\lor$, $\exists$ and
$\land$ (by Lemma 2.4).

Let $\Gamma \nvdash_c A$, i.e., $\Gamma, \mathrm{Stab} \nvdash A$. By the
completeness theorem of minimal logic, there is a Beth-structure
$\mathcal{B} = (\mathrm{Ter}_{\mathcal{L}}, I_0, I_1)$ on the complete binary
tree $T_{01}$ and a node $l_0$ such that $l_0 \Vdash \Gamma, \mathrm{Stab}$
and $l_0 \nVdash A$ (we write $k \Vdash B$ for
$\mathcal{B}, k \Vdash B[\mathrm{id}]$).

A node $k$ is consistent if $k \nVdash \bot$, and stable if
$k \Vdash \mathrm{Stab}$. Let $k$ be a stable node, and $B$ a formula (without
$\lor$, $\exists$ and $\land$). Then, $\mathrm{Stab} \vdash \neg\neg B \to B$
by the stability lemma. Hence, $k \Vdash \neg\neg B \to B$, and

(9)    $k \nVdash B \iff k \nVdash \neg\neg B$
                   $\iff \exists k' \succeq k.\ k'$ consistent and $k' \Vdash \neg B$.


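Let $\alpha$ be a branch in the underlying tree $T_{01}$. We define

    $\alpha \Vdash A :\iff \exists k \in \alpha\ k \Vdash A$,
    $\alpha$ is consistent $:\iff \alpha \nVdash \bot$,
    $\alpha$ is stable $:\iff \exists k \in \alpha\ k \Vdash \mathrm{Stab}$.

Note that

(10)    from $\alpha \Vdash \vec{A}$ and $\vdash \vec{A} \to B$ it follows that $\alpha \Vdash B$.

To see this, consider $\alpha \Vdash \vec{A}$. Then $k \Vdash \vec{A}$ for a
$k \in \alpha$, since $\alpha$ is linearly ordered. From
$\vdash \vec{A} \to B$ it follows that $k \Vdash B$, i.e.,
$\alpha \Vdash B$.

A branch $\alpha$ is generic (in the sense that it generates a classical
model) if it is consistent and stable, if in addition for all formulas $B$

(11)    $\alpha \Vdash B$ or $\alpha \Vdash \neg B$,

and for all formulas $\forall \vec{y} B$ (with $\vec{y}$ not empty) where $B$
is not universally quantified

(12)    $\forall \vec{s} \in \mathrm{Ter}_{\mathcal{L}}\ \alpha \Vdash B[\vec{y} := \vec{s}] \Rightarrow \alpha \Vdash \forall \vec{y} B$.

For a branch $\alpha$, we define a classical structure
$\mathcal{M}^\alpha = (\mathrm{Ter}_{\mathcal{L}}, I_0, I_1^\alpha)$ as

    $I_1^\alpha(R) := \bigcup_{k \in \alpha} I_1(R, k)$  for $R \ne \bot$.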
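We show that for every generic branch $\alpha$ and each formula $B$ with all
connectives in $\{\to, \forall\}$

(13)    $\alpha \Vdash B \iff \mathcal{M}^\alpha \models B$.

The proof is by induction on the logical complexity of $B$.

Case $R\vec{t}$, $R \ne \bot$. Then the proposition holds for all $\alpha$.

Case $\bot$. We have $\alpha \nVdash \bot$ for all consistent $\alpha$.

Case $B \to C$. $\Rightarrow$. Let $\alpha \Vdash B \to C$ and
$\mathcal{M}^\alpha \models B$. We must show that
$\mathcal{M}^\alpha \models C$. Note that $\alpha \Vdash B$ by IH, hence
$\alpha \Vdash C$, hence $\mathcal{M}^\alpha \models C$ again by IH.
$\Leftarrow$. Let $\mathcal{M}^\alpha \models B \to C$. If
$\mathcal{M}^\alpha \models B$, then $\mathcal{M}^\alpha \models C$, hence
$\alpha \Vdash C$ by IH and therefore $\alpha \Vdash B \to C$. If
$\mathcal{M}^\alpha \not\models B$, then $\alpha \nVdash B$ by IH, hence
$\alpha \Vdash \neg B$ by (11) and therefore $\alpha \Vdash B \to C$, since
$\alpha$ is stable (and $\vdash (\neg\neg C \to C) \to \bot \to C$).

Case $\forall \vec{y} B$ ($\vec{y}$ not empty) where $B$ is not universally
quantified. The following propositions are equivalent.

    $\alpha \Vdash \forall \vec{y} B$
    $\forall \vec{s} \in \mathrm{Ter}_{\mathcal{L}}\ \alpha \Vdash B[\vec{y} := \vec{s}]$                by (12)
    $\forall \vec{s} \in \mathrm{Ter}_{\mathcal{L}}\ \mathcal{M}^\alpha \models B[\vec{y} := \vec{s}]$   by IH
    $\mathcal{M}^\alpha \models \forall \vec{y} B$.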
We show that for each consistent stable node $k$, there is a generic branch
containing $k$. For the purposes of the proof, we let $A_0, A_1, \dots$ be an
enumeration of formulas. We define a sequence
$k = k_0 \preceq k_1 \preceq k_2 \dots$ of consistent stable nodes
inductively. Let $k_0 := k$. Assume that $k_n$ is defined. We write $A_n$ in
the form $\forall \vec{y} B$ ($\vec{y}$ possibly empty) where $B$ is not a
universal formula. In case $k_n \Vdash \forall \vec{y} B$ let
$k_{n+1} := k_n$. Otherwise we have
$k_n \nVdash B[\vec{y} := \vec{s}]$ for some $\vec{s}$, and by (9) there is a
consistent node $k' \succeq k_n$ such that
$k' \Vdash \neg B[\vec{y} := \vec{s}]$. Let $k_{n+1} := k'$. Since
$k_n \preceq k_{n+1}$, the node $k_{n+1}$ is stable.

Let $\alpha := \{\, l \mid \exists n\ l \preceq k_n \,\}$, hence
$k \in \alpha$. We show that $\alpha$ is generic. Clearly $\alpha$ is
consistent and stable. The propositions (11) and (12) can be proved
simultaneously. Let $C = \forall \vec{y} B$, where $B$ is not a universal
formula, and choose $n$ with $C = A_n$. In case
$k_n \Vdash \forall \vec{y} B$ we are done. Otherwise we have
$k_n \nVdash B[\vec{y} := \vec{s}]$ for some $\vec{s}$, and by construction
$k_{n+1} \Vdash \neg B[\vec{y} := \vec{s}]$. For (11) we get
$k_{n+1} \Vdash \neg \forall \vec{y} B$ (since
$\vdash \forall \vec{y} B \to B[\vec{y} := \vec{s}]$), and (12) follows from
the consistency of $\alpha$.

We are now in a position to give a proof of completeness. Since
$l_0 \nVdash A$ and $l_0$ is stable, (9) yields a consistent node
$k \succeq l_0$ such that $k \Vdash \neg A$. Evidently, $k$ is stable as well.
By the proof above, there is a generic branch $\alpha$ such that
$k \in \alpha$. Since $k \Vdash \neg A$ it follows that
$\alpha \Vdash \neg A$, hence $\mathcal{M}^\alpha \models \neg A$ by (13).
Moreover, $\alpha \Vdash \Gamma$, and $\mathcal{M}^\alpha \models \Gamma$
follows by (13). Then, $\Gamma \not\models A$.


4.2. Compactness, Löwenheim-Skolem Theorem. The completeness
theorem has many important corollaries. We mention only two. A set $\Gamma$
of $\mathcal{L}$-formulas is consistent if $\Gamma \nvdash_c \bot$, and
satisfiable if there is an $\mathcal{L}$-structure $\mathcal{M}$ and an
assignment $\eta$ in $|\mathcal{M}|$ such that $\mathcal{M} \models B[\eta]$
for all $B \in \Gamma$.


COROLLARY. Let $\Gamma$ be a set of $\mathcal{L}$-formulas.
(a) If $\Gamma$ is consistent, then $\Gamma$ is satisfiable.
(b) (Compactness theorem). If each finite subset of $\Gamma$ is satisfiable,
    $\Gamma$ is satisfiable.


PROOF. (a). From $\Gamma \nvdash_c \bot$ we obtain
$\Gamma \not\models \bot$ by the completeness theorem, and this implies
satisfiability of $\Gamma$.

(b). Otherwise we have $\Gamma \models \bot$, hence $\Gamma \vdash_c \bot$ by
the completeness theorem, hence also $\Gamma_0 \vdash_c \bot$ for a finite
subset $\Gamma_0 \subseteq \Gamma$, and therefore $\Gamma_0 \models \bot$
contrary to our assumption that $\Gamma_0$ has a model.


COROLLARY (Löwenheim and Skolem). Let $\Gamma$ be a set of
$\mathcal{L}$-formulas (we assume that $\mathcal{L}$ is countable). If
$\Gamma$ is satisfiable, then $\Gamma$ is satisfiable on an
$\mathcal{L}$-structure with a countable carrier set.


PROOF. We make use of the proof of the completeness theorem with
$A = \bot$. It either yields $\Gamma \vdash_c \bot$ (which is excluded by
assumption), or else a model of $\Gamma \cup \{\neg\bot\}$, whose carrier set
is the countable set $\mathrm{Ter}_{\mathcal{L}}$.


5. Uncountable Languages 


We give a second proof of the completeness theorem for classical logic,
which works for uncountable languages as well. This proof makes use of the
axiom of choice (in the form of Zorn's lemma).


5.1. Filters and Ultrafilters. Let $M \ne \emptyset$ be a set.
$F \subseteq \mathcal{P}(M)$ is called a filter on $M$, if
(a) $M \in F$ and $\emptyset \notin F$;
(b) if $X \in F$ and $X \subseteq Y \subseteq M$, then $Y \in F$;
(c) $X, Y \in F$ entails $X \cap Y \in F$.


$F$ is called an ultrafilter, if for all $X \in \mathcal{P}(M)$

    $X \in F$ or $M \setminus X \in F$.


The intuition here is that the elements $X$ of a filter $F$ are considered to
be "big". For instance, for $M$ infinite the set
$F = \{\, X \subseteq M \mid M \setminus X \text{ finite} \,\}$ is a filter.
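
On a finite set the filter axioms can be checked by brute force. The sketch
below is an illustration only: the helper names (`powerset`, `is_filter`,
`is_ultrafilter`, `principal`) are hypothetical, and the cofinite filter from
the text is not available here, since it needs an infinite $M$. Instead the
example uses a principal filter, i.e. all sets containing a fixed element.

```python
from itertools import combinations


def powerset(M):
    """All subsets of the finite set M, as frozensets."""
    items = list(M)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_filter(F, M):
    """Check conditions (a), (b), (c) for a family F of subsets of M."""
    M = frozenset(M)
    F = set(F)
    if M not in F or frozenset() in F:                              # (a)
        return False
    if any(X <= Y and Y not in F for X in F for Y in powerset(M)):  # (b)
        return False
    return all((X & Y) in F for X in F for Y in F)                  # (c)


def is_ultrafilter(F, M):
    """A filter containing X or its complement for every X."""
    M = frozenset(M)
    return is_filter(F, M) and all(X in F or (M - X) in F for X in powerset(M))


def principal(M, i):
    """All subsets of M that contain the element i."""
    return {X for X in powerset(M) if i in X}


M = {0, 1, 2}
U = principal(M, 1)                                       # an ultrafilter
F = {X for X in powerset(M) if frozenset({0, 1}) <= X}    # a filter only
```

Here `U` is an ultrafilter, while `F` is a filter that decides neither `{0}`
nor its complement `{1, 2}`.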


LEMMA. Suppose $F$ is an ultrafilter and $X \cup Y \in F$. Then $X \in F$ or
$Y \in F$.


PROOF. If both $X$ and $Y$ are not in $F$, then $M \setminus X$ and
$M \setminus Y$ are in $F$, hence also
$(M \setminus X) \cap (M \setminus Y)$, which is $M \setminus (X \cup Y)$.
This contradicts the assumption $X \cup Y \in F$.

Let $M \ne \emptyset$ be a set and $S \subseteq \mathcal{P}(M)$. $S$ has the
finite intersection property, if $X_1 \cap \dots \cap X_n \ne \emptyset$ for
all $X_1, \dots, X_n \in S$ and all $n \in \mathbb{N}$.


LEMMA. If $S$ has the finite intersection property, then there exists a
filter $F$ on $M$ such that $F \supseteq S$.


PROOF. $F := \{\, X \mid X \supseteq X_1 \cap \dots \cap X_n \text{ for some } X_1, \dots, X_n \in S \,\}$.
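
For a finite $M$ the filter named in this proof can be computed explicitly.
The following sketch assumes that $X$ ranges over subsets of $M$ and that the
empty intersection ($n = 0$) counts as $M$ itself; the function names are
hypothetical.

```python
from itertools import combinations


def powerset(M):
    """All subsets of the finite set M, as frozensets."""
    items = list(M)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def finite_intersections(S, M):
    """All intersections X_1 & ... & X_n of members of S (M itself for n = 0)."""
    S = [frozenset(X) for X in S]
    result = {frozenset(M)}
    for n in range(1, len(S) + 1):
        for family in combinations(S, n):
            result.add(frozenset(M).intersection(*family))
    return result


def has_fip(S, M):
    """Finite intersection property: no finite intersection is empty."""
    return all(X for X in finite_intersections(S, M))


def generated_filter(S, M):
    """All subsets of M that include some finite intersection of members of S."""
    cores = finite_intersections(S, M)
    return {X for X in powerset(M) if any(core <= X for core in cores)}


M = {0, 1, 2, 3}
S = [{0, 1, 2}, {1, 2, 3}]
F = generated_filter(S, M)
```

In this example the smallest intersection is `{1, 2}`, so `F` consists of the
four supersets of `{1, 2}`. A family such as `[{0}, {1}]` fails the finite
intersection property, and the construction would then put the empty set into
the result.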


LEMMA. Let $M \ne \emptyset$ be a set and $F$ a filter on $M$. Then there is
an ultrafilter $U$ on $M$ such that $U \supseteq F$.


PROOF. By Zorn's lemma (which will be proved - from the axiom of
choice - in Chapter 5), there is a maximal filter $U$ with $F \subseteq U$.
We claim that $U$ is an ultrafilter. So let $X \subseteq M$ and assume
$X \notin U$ and $M \setminus X \notin U$. Since $U$ is maximal,
$U \cup \{X\}$ cannot have the finite intersection property; hence there is a
$Y \in U$ such that $Y \cap X = \emptyset$. Similarly we obtain $Z \in U$
such that $Z \cap (M \setminus X) = \emptyset$. But then
$Y \cap Z = \emptyset$, a contradiction.


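5.2. Products and Ultraproducts. Let $M \ne \emptyset$ be a set and
$A_i \ne \emptyset$ sets for $i \in M$. Let

    $\prod_{i \in M} A_i := \{\, \alpha \mid \alpha \text{ is a function, } \mathrm{dom}(\alpha) = M \text{ and } \alpha(i) \in A_i \text{ for all } i \in M \,\}$.

Observe that, by the axiom of choice, $\prod_{i \in M} A_i \ne \emptyset$. We
write $\alpha \in \prod_{i \in M} A_i$ as
$\langle \alpha(i) \mid i \in M \rangle$.

Now let $M \ne \emptyset$ be a set, $F$ a filter on $M$ and $\mathcal{A}_i$
structures for $i \in M$. Then the $F$-product structure
$\mathcal{A} = \prod^F_{i \in M} \mathcal{A}_i$ is defined by

(a) $|\mathcal{A}| := \prod_{i \in M} |\mathcal{A}_i|$ (notice that
    $|\mathcal{A}| \ne \emptyset$).
(b) for an $n$-ary relation symbol $R$ and
    $\alpha_1, \dots, \alpha_n \in |\mathcal{A}|$ let
    $R^{\mathcal{A}}(\alpha_1, \dots, \alpha_n) :\iff \{\, i \in M \mid R^{\mathcal{A}_i}(\alpha_1(i), \dots, \alpha_n(i)) \,\} \in F$.
(c) for an $n$-ary function symbol $f$ and
    $\alpha_1, \dots, \alpha_n \in |\mathcal{A}|$ let
    $f^{\mathcal{A}}(\alpha_1, \dots, \alpha_n) := \langle f^{\mathcal{A}_i}(\alpha_1(i), \dots, \alpha_n(i)) \mid i \in M \rangle$.

For an ultrafilter $U$ we call
$\mathcal{A} = \prod^U_{i \in M} \mathcal{A}_i$ the $U$-ultraproduct of the
$\mathcal{A}_i$ for $i \in M$.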
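
Clauses (a) to (c) can be tried out on a small finite example. In the sketch
below the elements $\alpha$ of the product are Python dictionaries from indices
to values; the function names and the two factor structures (carriers
`{0, 1}` and `{0, 1, 2}` with a unary relation "is zero" and a successor
function) are hypothetical choices made for this illustration.

```python
from itertools import product


def product_carrier(carriers):
    """(a): all functions alpha with alpha(i) in carriers[i], as dicts."""
    idx = sorted(carriers)
    return [dict(zip(idx, choice))
            for choice in product(*(carriers[i] for i in idx))]


def product_relation(rels, F, alphas):
    """(b): R holds of alphas iff the set of indices i where R holds in the
    i-th factor belongs to the filter F."""
    agree = frozenset(i for i in rels
                      if tuple(a[i] for a in alphas) in rels[i])
    return agree in F


def product_function(funcs, alphas):
    """(c): apply the i-th interpretation of f componentwise."""
    return {i: funcs[i](*(a[i] for a in alphas)) for i in funcs}


carriers = {0: [0, 1], 1: [0, 1, 2]}
is_zero = {0: {(0,)}, 1: {(0,)}}                  # unary relation per factor
succ = {0: lambda a: (a + 1) % 2, 1: lambda a: (a + 1) % 3}

trivial = {frozenset({0, 1})}                     # filter {M}: direct product
at_1 = {frozenset({1}), frozenset({0, 1})}        # principal ultrafilter at 1

alpha = {0: 1, 1: 0}                              # zero only in component 1
```

With the trivial filter `{M}` the relation must hold in every component, so
`alpha` is not "zero"; with the principal ultrafilter at index 1 only that
component matters, so `alpha` is "zero".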


5.3. The Fundamental Theorem on Ultraproducts. The prop- 
erties of ultrafilters correspond in a certain sense to the definition of the 
consequence relation |. For example, for an ultrafilter U we have 

ME (AV B)[n] = ME Aly] or ME Bln] 
XUYEU = XEUorYEU 


and 


M - ~A[n] MK Aln] 
X€U <> M\XEU. 


This is the background of the following theorem. 
THEOREM (Fundamental Theorem on Ultraproducts, Łoś 1955). Let
$\mathcal{M} = \prod^U_{i \in M} \mathcal{M}_i$ be a $U$-ultraproduct, $A$ a formula and $\eta$ an assignment
in $|\mathcal{M}|$. Then we have

\[ \mathcal{M} \models A[\eta] \iff \{\, i \in M \mid \mathcal{M}_i \models A[\eta_i] \,\} \in U, \]

where $\eta_i$ is the assignment induced by $\eta_i(x) = \eta(x)(i)$ for $i \in M$.


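PROOF. We first prove a similar property for terms.
\[ t^{\mathcal{M}}[\eta] = \langle\, t^{\mathcal{M}_i}[\eta_i] \mid i \in M \,\rangle. \tag{14} \]

The proof is by induction on $t$. For a variable the claim follows from the
definition. Case $f t_1 \dots t_n$. For simplicity assume $n = 1$; so we consider $ft$.
We obtain
\begin{align*}
(ft)^{\mathcal{M}}[\eta] &= f^{\mathcal{M}}(t^{\mathcal{M}}[\eta]) \\
 &= f^{\mathcal{M}}(\langle\, t^{\mathcal{M}_i}[\eta_i] \mid i \in M \,\rangle) && \text{by IH} \\
 &= \langle\, (ft)^{\mathcal{M}_i}[\eta_i] \mid i \in M \,\rangle.
\end{align*}
The theorem itself is now proved by induction on $A$.
Case $R t_1 \dots t_n$. For simplicity assume $n = 1$; so consider $Rt$. We obtain
\begin{align*}
\mathcal{M} \models Rt[\eta] &\iff R^{\mathcal{M}}(t^{\mathcal{M}}[\eta]) \\
 &\iff \{\, i \in M \mid R^{\mathcal{M}_i}(t^{\mathcal{M}}[\eta](i)) \,\} \in U \\
 &\iff \{\, i \in M \mid R^{\mathcal{M}_i}(t^{\mathcal{M}_i}[\eta_i]) \,\} \in U && \text{by (14)} \\
 &\iff \{\, i \in M \mid \mathcal{M}_i \models Rt[\eta_i] \,\} \in U.
\end{align*}
Case $A \to B$.
\begin{align*}
 & \mathcal{M} \models (A \to B)[\eta] \\
\iff{} & \text{if } \mathcal{M} \models A[\eta], \text{ then } \mathcal{M} \models B[\eta] \\
\iff{} & \text{if } \{\, i \in M \mid \mathcal{M}_i \models A[\eta_i] \,\} \in U, \text{ then } \{\, i \in M \mid \mathcal{M}_i \models B[\eta_i] \,\} \in U && \text{by IH} \\
\iff{} & \{\, i \in M \mid \mathcal{M}_i \models A[\eta_i] \,\} \notin U \text{ or } \{\, i \in M \mid \mathcal{M}_i \models B[\eta_i] \,\} \in U \\
\iff{} & \{\, i \in M \mid \mathcal{M}_i \models \neg A[\eta_i] \,\} \in U \text{ or } \{\, i \in M \mid \mathcal{M}_i \models B[\eta_i] \,\} \in U && \text{for $U$ is an ultrafilter} \\
\iff{} & \{\, i \in M \mid \mathcal{M}_i \models (A \to B)[\eta_i] \,\} \in U.
\end{align*}
Case $\forall x A$.
\begin{align*}
 & \mathcal{M} \models (\forall x A)[\eta] \\
\iff{} & \text{for all } \alpha \in |\mathcal{M}|,\ \mathcal{M} \models A[\eta_x^{\alpha}] \\
\iff{} & \text{for all } \alpha \in |\mathcal{M}|,\ \{\, i \in M \mid \mathcal{M}_i \models A[(\eta_i)_x^{\alpha(i)}] \,\} \in U && \text{by IH} \\
\iff{} & \{\, i \in M \mid \text{for all } a \in |\mathcal{M}_i|,\ \mathcal{M}_i \models A[(\eta_i)_x^{a}] \,\} \in U && \text{see below} \tag{15} \\
\iff{} & \{\, i \in M \mid \mathcal{M}_i \models (\forall x A)[\eta_i] \,\} \in U.
\end{align*}
It remains to show (15). Let $X := \{\, i \in M \mid \text{for all } a \in |\mathcal{M}_i|,\ \mathcal{M}_i \models A[(\eta_i)_x^{a}] \,\}$
and $Y_\alpha := \{\, i \in M \mid \mathcal{M}_i \models A[(\eta_i)_x^{\alpha(i)}] \,\}$ for $\alpha \in |\mathcal{M}|$.

$\Leftarrow$. Let $\alpha \in |\mathcal{M}|$ and $X \in U$. Clearly $X \subseteq Y_\alpha$, hence also $Y_\alpha \in U$.

$\Rightarrow$. Let $Y_\alpha \in U$ for all $\alpha$. Assume $X \notin U$. Since $U$ is an ultrafilter,
\[ M \setminus X = \{\, i \in M \mid \text{there is an } a \in |\mathcal{M}_i| \text{ such that } \mathcal{M}_i \not\models A[(\eta_i)_x^{a}] \,\} \in U. \]

We choose by the axiom of choice an $\alpha_0 \in |\mathcal{M}|$ such that

\[ \alpha_0(i) = \begin{cases} \text{some } a \in |\mathcal{M}_i| \text{ such that } \mathcal{M}_i \not\models A[(\eta_i)_x^{a}] & \text{if } i \in M \setminus X, \\ \text{an arbitrary } a \in |\mathcal{M}_i| & \text{otherwise.} \end{cases} \]

Then $Y_{\alpha_0} \cap (M \setminus X) = \emptyset$, contradicting $Y_{\alpha_0}, M \setminus X \in U$.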


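If we choose $\mathcal{M}_i = \mathcal{B}$ constant, then $\mathcal{M} = \prod^U_{i \in M} \mathcal{B}$ satisfies the same
formulas as $\mathcal{B}$ (such structures will be called elementarily equivalent in section 6;
the notation is $\mathcal{M} \equiv \mathcal{B}$). $\prod^U_{i \in M} \mathcal{B}$ is called an ultrapower of $\mathcal{B}$.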


5.4. General Compactness and Completeness. 


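COROLLARY (General Compactness Theorem). Every finitely satisfiable
set $\Gamma$ of formulas is satisfiable.


PROOF. Let $M := \{\, i \subseteq \Gamma \mid i \text{ finite} \,\}$. For $i \in M$ let $\mathcal{M}_i$ be a model of
$i$ under the assignment $\eta_i$. For $A \in \Gamma$ let $Z_A := \{\, i \in M \mid A \in i \,\} = \{\, i \subseteq
\Gamma \mid i \text{ finite and } A \in i \,\}$. Then $F := \{\, Z_A \mid A \in \Gamma \,\}$ has the finite intersection
property (for $\{A_1, \dots, A_n\} \in Z_{A_1} \cap \dots \cap Z_{A_n}$). By the lemmata in 5.1 there is
an ultrafilter $U$ on $M$ such that $F \subseteq U$. We consider $\mathcal{M} := \prod^U_{i \in M} \mathcal{M}_i$ and the
product assignment $\eta$ such that $\eta(x)(i) := \eta_i(x)$, and show $\mathcal{M} \models \Gamma[\eta]$. So let
$A \in \Gamma$. By the theorem it suffices to show $X_A := \{\, i \in M \mid \mathcal{M}_i \models A[\eta_i] \,\} \in U$.
But this follows from $Z_A \subseteq X_A$ and $Z_A \in F \subseteq U$.


An immediate consequence is that if $\Gamma \models A$, then there exists a finite
subset $\Gamma' \subseteq \Gamma$ such that $\Gamma' \models A$.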

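For every set $\Gamma$ of formulas let $\mathcal{L}(\Gamma)$ be the set of all function and relation
symbols occurring in $\Gamma$. If $\mathcal{L}$ is a sublanguage of $\mathcal{L}'$, $\mathcal{M}$ an $\mathcal{L}$-structure and
$\mathcal{M}'$ an $\mathcal{L}'$-structure, then $\mathcal{M}'$ is called an expansion of $\mathcal{M}$ (and $\mathcal{M}$ a reduct of
$\mathcal{M}'$), if $|\mathcal{M}| = |\mathcal{M}'|$, $f^{\mathcal{M}} = f^{\mathcal{M}'}$ for all function symbols and $R^{\mathcal{M}} = R^{\mathcal{M}'}$ for
all relation symbols in the language $\mathcal{L}$. The (uniquely determined) $\mathcal{L}$-reduct
of $\mathcal{M}'$ is denoted by $\mathcal{M}' \restriction \mathcal{L}$. If $\mathcal{M}'$ is an expansion of $\mathcal{M}$ and $\eta$ an assignment
in $|\mathcal{M}|$, then clearly $t^{\mathcal{M}}[\eta] = t^{\mathcal{M}'}[\eta]$ for every $\mathcal{L}$-term $t$ and $\mathcal{M} \models A[\eta]$ iff
$\mathcal{M}' \models A[\eta]$ for every $\mathcal{L}$-formula $A$. Hence the validity of $\Gamma \models A$ does not
depend on the underlying language $\mathcal{L}$, as long as $\mathcal{L}(\Gamma \cup \{A\}) \subseteq \mathcal{L}$ (or more
precisely $\subseteq \mathrm{Fun}_{\mathcal{L}} \cup \mathrm{Rel}_{\mathcal{L}}$).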

COROLLARY (General Completeness Theorem). Let $\Gamma \cup \{A\}$ be a set of
formulas, where the underlying language may be uncountable. Then

\[ \Gamma \vdash_c A \iff \Gamma \models A. \]


PROOF. One direction again is the soundness theorem. For the converse
we can assume (by the first remark above) that for some finite $\Gamma' \subseteq \Gamma$ we
have $\Gamma' \models A$. But then we have $\Gamma' \models A$ in a countable language (by the
second remark above). By the completeness theorem for countable languages
we obtain $\Gamma' \vdash_c A$, hence also $\Gamma \vdash_c A$.
6. Basics of Model Theory


In this section we will (as is common in model theory) also allow uncountable
languages $\mathcal{L}$. As we have just seen, completeness as well as compactness
hold for such languages as well.


6.1. Equality Axioms. We first consider equality axioms. So we assume
in this section that our underlying language $\mathcal{L}$ contains a binary relation
symbol $=$. The set $\mathrm{Eq}_{\mathcal{L}}$ of $\mathcal{L}$-equality axioms consists of (the universal
closures of)

    $x = x$   (reflexivity),
    $x = y \to y = x$   (symmetry),
    $x = y \to y = z \to x = z$   (transitivity),
    $x_1 = y_1 \to \dots \to x_n = y_n \to f x_1 \dots x_n = f y_1 \dots y_n$,
    $x_1 = y_1 \to \dots \to x_n = y_n \to R x_1 \dots x_n \to R y_1 \dots y_n$,

for all $n$-ary function symbols $f$ and relation symbols $R$ of the language $\mathcal{L}$.

LEMMA (Equality). (a) $\mathrm{Eq}_{\mathcal{L}} \vdash t = s \to r[x := t] = r[x := s]$.
(b) $\mathrm{Eq}_{\mathcal{L}} \vdash t = s \to (A[x := t] \leftrightarrow A[x := s])$.

PROOF. (a). Induction on $r$. (b). Induction on $A$; we only consider
the case $\forall y A$. Then $(\forall y A)[x := r] = \forall y A[x := r]$, and by IH we have
$\mathrm{Eq}_{\mathcal{L}} \vdash t = s \to A[x := t] \to A[x := s]$. This entails the claim.


An $\mathcal{L}$-structure $\mathcal{M}$ satisfies the equality axioms iff $=^{\mathcal{M}}$ is a congruence
relation (i.e., an equivalence relation compatible with the functions and
relations of $\mathcal{M}$). In this section we assume that all $\mathcal{L}$-structures $\mathcal{M}$
considered satisfy the equality axioms. The coincidence lemma then also holds with
$=^{\mathcal{M}}$ instead of $=$:


LEMMA (Coincidence). Let $\eta$ and $\xi$ be assignments in $|\mathcal{M}|$ such that
$\mathrm{dom}(\eta) = \mathrm{dom}(\xi)$ and $\eta(x) =^{\mathcal{M}} \xi(x)$ for all $x \in \mathrm{dom}(\eta)$. Then
(a) $t^{\mathcal{M}}[\eta] =^{\mathcal{M}} t^{\mathcal{M}}[\xi]$ if $\mathrm{vars}(t) \subseteq \mathrm{dom}(\eta)$ and
(b) $\mathcal{M} \models A[\eta] \iff \mathcal{M} \models A[\xi]$ if $\mathrm{FV}(A) \subseteq \mathrm{dom}(\eta)$.


PROOF. Induction on $t$ and $A$, respectively.


6.2. Cardinality of Models. Let $\mathcal{M}/{=^{\mathcal{M}}}$ be the quotient structure,
whose carrier set consists of congruence classes. We call a structure $\mathcal{M}$
infinite (countable, of cardinality $n$), if $\mathcal{M}/{=^{\mathcal{M}}}$ is infinite (countable, of
cardinality $n$).

By an axiom system $\Gamma$ we understand a set of closed formulas such that
$\mathrm{Eq}_{\mathcal{L}(\Gamma)} \subseteq \Gamma$. A model of an axiom system $\Gamma$ is an $\mathcal{L}$-structure $\mathcal{M}$ such that
$\mathcal{L}(\Gamma) \subseteq \mathcal{L}$ and $\mathcal{M} \models \Gamma$. For sets $\Gamma$ of closed formulas we write

\[ \mathrm{Mod}_{\mathcal{L}}(\Gamma) := \{\, \mathcal{M} \mid \mathcal{M} \text{ is an } \mathcal{L}\text{-structure and } \mathcal{M} \models \Gamma \cup \mathrm{Eq}_{\mathcal{L}} \,\}. \]

Clearly $\Gamma$ is satisfiable iff $\Gamma$ has a model.


THEOREM. If an axiom system has arbitrarily large finite models, then
it has an infinite model.

PROOF. Let $\Gamma$ be such an axiom system. Suppose $x_0, x_1, x_2, \dots$ are
distinct variables and
\[ \Gamma' := \Gamma \cup \{\, x_i \neq x_j \mid i, j \in \mathbb{N} \text{ such that } i < j \,\}. \]
By assumption every finite subset of $\Gamma'$ is satisfiable, hence by the general
compactness theorem so is $\Gamma'$. Then we have $\mathcal{M}$ and $\eta$ such that $\mathcal{M} \models \Gamma'[\eta]$
and therefore $\eta(x_i) \neq^{\mathcal{M}} \eta(x_j)$ for $i < j$. Hence $\mathcal{M}$ is infinite.


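6.3. Complete Theories, Elementary Equivalence. Let $L$ be the
set of all closed $\mathcal{L}$-formulas. By a theory $T$ we mean an axiom system closed
under $\vdash_c$, so $\mathrm{Eq}_{\mathcal{L}(T)} \subseteq T$ and

\[ T = \{\, A \in L(T) \mid T \vdash_c A \,\}. \]
A theory $T$ is called complete, if for every formula $A \in L(T)$, $T \vdash_c A$ or
$T \vdash_c \neg A$.

For every $\mathcal{L}$-structure $\mathcal{M}$ (satisfying the equality axioms) the set of all
closed $\mathcal{L}$-formulas $A$ such that $\mathcal{M} \models A$ clearly is a theory; it is called the
theory of $\mathcal{M}$ and denoted by $\mathrm{Th}(\mathcal{M})$.

Two $\mathcal{L}$-structures $\mathcal{M}$ and $\mathcal{M}'$ are called elementarily equivalent (written
$\mathcal{M} \equiv \mathcal{M}'$), if $\mathrm{Th}(\mathcal{M}) = \mathrm{Th}(\mathcal{M}')$. Two $\mathcal{L}$-structures $\mathcal{M}$ and $\mathcal{M}'$ are called
isomorphic (written $\mathcal{M} \cong \mathcal{M}'$), if there is a map $\pi \colon |\mathcal{M}| \to |\mathcal{M}'|$ inducing a
bijection between $|\mathcal{M}/{=^{\mathcal{M}}}|$ and $|\mathcal{M}'/{=^{\mathcal{M}'}}|$, so

\[ \forall a, b \in |\mathcal{M}| .\ a =^{\mathcal{M}} b \iff \pi(a) =^{\mathcal{M}'} \pi(b), \]
\[ (\forall a' \in |\mathcal{M}'|)(\exists a \in |\mathcal{M}|)\ \pi(a) =^{\mathcal{M}'} a', \]
such that for all $a_1, \dots, a_n \in |\mathcal{M}|$
\[ \pi(f^{\mathcal{M}}(a_1, \dots, a_n)) =^{\mathcal{M}'} f^{\mathcal{M}'}(\pi(a_1), \dots, \pi(a_n)), \]
\[ R^{\mathcal{M}}(a_1, \dots, a_n) \iff R^{\mathcal{M}'}(\pi(a_1), \dots, \pi(a_n)) \]
for all $n$-ary function symbols $f$ and relation symbols $R$ of the language $\mathcal{L}$.


We first collect some simple properties of the notions of the theory of a
structure $\mathcal{M}$ and of elementary equivalence.


LEMMA. (a) $\mathrm{Th}(\mathcal{M})$ is complete.
(b) If $\Gamma$ is an axiom system such that $\mathcal{L}(\Gamma) \subseteq \mathcal{L}$, then
\[ \{\, A \in L \mid \Gamma \vdash_c A \,\} = \bigcap \{\, \mathrm{Th}(\mathcal{M}) \mid \mathcal{M} \in \mathrm{Mod}_{\mathcal{L}}(\Gamma) \,\}. \]
(c) $\mathcal{M} \equiv \mathcal{M}' \iff \mathcal{M} \models \mathrm{Th}(\mathcal{M}')$.
(d) If $\mathcal{L}$ is countable, then for every $\mathcal{L}$-structure $\mathcal{M}$ there is a countable
$\mathcal{L}$-structure $\mathcal{M}'$ such that $\mathcal{M} \equiv \mathcal{M}'$.


PROOF. (a). Let $\mathcal{M}$ be an $\mathcal{L}$-structure and $A \in L$. Then $\mathcal{M} \models A$ or
$\mathcal{M} \models \neg A$, hence $\mathrm{Th}(\mathcal{M}) \vdash_c A$ or $\mathrm{Th}(\mathcal{M}) \vdash_c \neg A$.
(b). For all $A \in L$ we have
\begin{align*}
\Gamma \vdash_c A &\iff \Gamma \models A \\
 &\iff \text{for all } \mathcal{L}\text{-structures } \mathcal{M},\ (\mathcal{M} \models \Gamma \Rightarrow \mathcal{M} \models A) \\
 &\iff \text{for all } \mathcal{L}\text{-structures } \mathcal{M},\ (\mathcal{M} \in \mathrm{Mod}_{\mathcal{L}}(\Gamma) \Rightarrow A \in \mathrm{Th}(\mathcal{M})) \\
 &\iff A \in \bigcap \{\, \mathrm{Th}(\mathcal{M}) \mid \mathcal{M} \in \mathrm{Mod}_{\mathcal{L}}(\Gamma) \,\}.
\end{align*}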
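(c). $\Rightarrow$. Assume $\mathcal{M} \equiv \mathcal{M}'$ and $A \in \mathrm{Th}(\mathcal{M}')$. Then $\mathcal{M}' \models A$, hence
$\mathcal{M} \models A$.

$\Leftarrow$. Assume $\mathcal{M} \models \mathrm{Th}(\mathcal{M}')$. Then clearly $\mathrm{Th}(\mathcal{M}') \subseteq \mathrm{Th}(\mathcal{M})$. For the
converse inclusion let $A \in \mathrm{Th}(\mathcal{M})$. If $A \notin \mathrm{Th}(\mathcal{M}')$, by (a) we would also
have $\neg A \in \mathrm{Th}(\mathcal{M}')$, hence $\mathcal{M} \models \neg A$ contradicting $A \in \mathrm{Th}(\mathcal{M})$.

(d). Let $\mathcal{L}$ be countable and $\mathcal{M}$ an $\mathcal{L}$-structure. Then $\mathrm{Th}(\mathcal{M})$ is satisfiable
and therefore by the theorem of Löwenheim and Skolem possesses
a satisfying $\mathcal{L}$-structure $\mathcal{M}'$ with a countable carrier set $\mathrm{Ter}_{\mathcal{L}}$. By (c),
$\mathcal{M} \equiv \mathcal{M}'$.


Moreover, we can characterize complete theories as follows:


THEOREM. Let $T$ be a theory and $\mathcal{L} = \mathcal{L}(T)$. Then the following are
equivalent.


(a) $T$ is complete.
(b) For every model $\mathcal{M} \in \mathrm{Mod}_{\mathcal{L}}(T)$, $\mathrm{Th}(\mathcal{M}) = T$.
(c) Any two models $\mathcal{M}, \mathcal{M}' \in \mathrm{Mod}_{\mathcal{L}}(T)$ are elementarily equivalent.


PROOF. (a) $\Rightarrow$ (b). Let $T$ be complete and $\mathcal{M} \in \mathrm{Mod}_{\mathcal{L}}(T)$. Then
$\mathcal{M} \models T$, hence $T \subseteq \mathrm{Th}(\mathcal{M})$. For the converse assume $A \in \mathrm{Th}(\mathcal{M})$. Then
$\neg A \notin \mathrm{Th}(\mathcal{M})$, hence $\neg A \notin T$ and therefore $A \in T$.

(b) $\Rightarrow$ (c) is clear.

(c) $\Rightarrow$ (a). Let $A \in L$ and $T \nvdash_c A$. Then there is a model $\mathcal{M}_0$ of
$T \cup \{\neg A\}$. Now let $\mathcal{M} \in \mathrm{Mod}_{\mathcal{L}}(T)$ be arbitrary. By (c) we have $\mathcal{M} \equiv \mathcal{M}_0$,
hence $\mathcal{M} \models \neg A$. Therefore $T \vdash_c \neg A$.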


6.4. Elementary Equivalence and Isomorphism. 


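LEMMA. Let $\pi$ be an isomorphism between $\mathcal{M}$ and $\mathcal{M}'$. Then for all
terms $t$ and formulas $A$ and for every sufficiently big assignment $\eta$ in $|\mathcal{M}|$


(a) $\pi(t^{\mathcal{M}}[\eta]) =^{\mathcal{M}'} t^{\mathcal{M}'}[\pi \circ \eta]$ and
(b) $\mathcal{M} \models A[\eta] \iff \mathcal{M}' \models A[\pi \circ \eta]$. In particular,

\[ \mathcal{M} \cong \mathcal{M}' \implies \mathcal{M} \equiv \mathcal{M}'. \]


PROOF. (a). Induction on $t$. For simplicity we only consider the case of
a unary function symbol.
\begin{align*}
\pi(x^{\mathcal{M}}[\eta]) &= \pi(\eta(x)) = x^{\mathcal{M}'}[\pi \circ \eta] \\
\pi(c^{\mathcal{M}}[\eta]) &= \pi(c^{\mathcal{M}}) =^{\mathcal{M}'} c^{\mathcal{M}'} \\
\pi((ft)^{\mathcal{M}}[\eta]) &= \pi(f^{\mathcal{M}}(t^{\mathcal{M}}[\eta])) \\
 &=^{\mathcal{M}'} f^{\mathcal{M}'}(\pi(t^{\mathcal{M}}[\eta])) \\
 &=^{\mathcal{M}'} f^{\mathcal{M}'}(t^{\mathcal{M}'}[\pi \circ \eta]) \\
 &= (ft)^{\mathcal{M}'}[\pi \circ \eta].
\end{align*}

(b). Induction on $A$. For simplicity we only consider the case of a unary
relation symbol and the case $\forall x A$.
\begin{align*}
\mathcal{M} \models Rt[\eta] &\iff R^{\mathcal{M}}(t^{\mathcal{M}}[\eta]) \\
 &\iff R^{\mathcal{M}'}(\pi(t^{\mathcal{M}}[\eta])) \\
 &\iff R^{\mathcal{M}'}(t^{\mathcal{M}'}[\pi \circ \eta]) \\
 &\iff \mathcal{M}' \models Rt[\pi \circ \eta],
\end{align*}
\begin{align*}
\mathcal{M} \models \forall x A[\eta] &\iff \text{for all } a \in |\mathcal{M}|,\ \mathcal{M} \models A[\eta_x^{a}] \\
 &\iff \text{for all } a \in |\mathcal{M}|,\ \mathcal{M}' \models A[\pi \circ \eta_x^{a}] \\
 &\iff \text{for all } a \in |\mathcal{M}|,\ \mathcal{M}' \models A[(\pi \circ \eta)_x^{\pi(a)}] \\
 &\iff \text{for all } a' \in |\mathcal{M}'|,\ \mathcal{M}' \models A[(\pi \circ \eta)_x^{a'}] \\
 &\iff \mathcal{M}' \models \forall x A[\pi \circ \eta].
\end{align*}

This concludes the proof.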


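The converse, i.e. that $\mathcal{M} \equiv \mathcal{M}'$ implies $\mathcal{M} \cong \mathcal{M}'$, is true for finite
structures (see exercise sheet 9), but not for infinite ones:


THEOREM. For every infinite structure $\mathcal{M}$ there is an elementarily
equivalent structure $\mathcal{M}_0$ not isomorphic to $\mathcal{M}$.


PROOF. Let $=^{\mathcal{M}}$ be the equality on $M := |\mathcal{M}|$, and let $\mathcal{P}(M)$ denote
the power set of $M$. For every $\alpha \in \mathcal{P}(M)$ choose a new constant $c_\alpha$. In the
language $\mathcal{L}' := \mathcal{L} \cup \{\, c_\alpha \mid \alpha \in \mathcal{P}(M) \,\}$ we consider the axiom system

\[ \Gamma := \mathrm{Th}(\mathcal{M}) \cup \{\, c_\alpha \neq c_\beta \mid \alpha, \beta \in \mathcal{P}(M) \text{ and } \alpha \neq \beta \,\} \cup \mathrm{Eq}_{\mathcal{L}'}. \]

Every finite subset of $\Gamma$ is satisfiable by an appropriate expansion of $\mathcal{M}$.
Hence by the general compactness theorem also $\Gamma$ is satisfiable, say by $\mathcal{M}_0'$.
Let $\mathcal{M}_0 := \mathcal{M}_0' \restriction \mathcal{L}$. We may assume that $=^{\mathcal{M}_0}$ is the equality on $|\mathcal{M}_0|$. $\mathcal{M}_0$
is not isomorphic to $\mathcal{M}$, for otherwise we would have an injection of $\mathcal{P}(M)$
into $M$ and therefore a contradiction.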


6.5. Non Standard Models. By what we just proved it is impossible
to characterize an infinite structure by a first order axiom system up
to isomorphism. However, if we extend first order logic by also allowing
quantification over sets $X$, we can formulate the following Peano axioms

    $\forall n\ S(n) \neq 0$,
    $\forall n \forall m .\ S(n) = S(m) \to n = m$,
    $\forall X .\ 0 \in X \to (\forall n .\ n \in X \to S(n) \in X) \to \forall n\ n \in X$.

One can show easily that $(\mathbb{N}, 0, S)$ is up to isomorphism the unique model
of the Peano axioms. A structure which is elementarily equivalent, but not
isomorphic to $\mathcal{N} := (\mathbb{N}, 0, S)$, is called a non standard model of the natural
numbers. In non standard models of the natural numbers the principle of
complete induction does not hold for all sets $X \subseteq \mathbb{N}$.

Similarly, a structure which is elementarily equivalent, but not isomorphic
to $(\mathbb{R}, 0, 1, +, \cdot, <)$ is called a non standard model of the reals. In every
non standard model of the reals the completeness axiom

    $\forall X .\ \emptyset \neq X \text{ bounded} \to \exists y .\ y = \sup(X)$

does not hold for all sets $X \subseteq \mathbb{R}$.


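THEOREM. There are countable non standard models of the natural numbers.

PROOF. Let $x$ be a variable and
\[ \Gamma := \mathrm{Th}(\mathcal{N}) \cup \{\, x \neq \underline{n} \mid n \in \mathbb{N} \,\}, \]
where $\underline{0} := 0$ and $\underline{n+1} := S\underline{n}$. Clearly every finite subset of $\Gamma$ is satisfiable,
hence by compactness also $\Gamma$. By the theorem of Löwenheim and Skolem we
then have a countable or finite $\mathcal{M}$ and an assignment $\eta$ such that $\mathcal{M} \models \Gamma[\eta]$.
Because of $\mathcal{M} \models \mathrm{Th}(\mathcal{N})$ we have $\mathcal{M} \equiv \mathcal{N}$ by 6.3; hence $\mathcal{M}$ is countable.
Moreover $\eta(x) \neq^{\mathcal{M}} \underline{n}^{\mathcal{M}}$ for all $n \in \mathbb{N}$, hence $\mathcal{M} \not\cong \mathcal{N}$.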


6.6. Archimedian Ordered Fields. We now consider some easy applications
to well-known axiom systems.
The axioms of field theory are (the equality axioms and)

    $x + (y + z) = (x + y) + z$,      $x \cdot (y \cdot z) = (x \cdot y) \cdot z$,
    $0 + x = x$,                      $1 \cdot x = x$,
    $(-x) + x = 0$,                   $x \neq 0 \to x^{-1} \cdot x = 1$,
    $x + y = y + x$,                  $x \cdot y = y \cdot x$,

and also

    $(x + y) \cdot z = (x \cdot z) + (y \cdot z)$,
    $1 \neq 0$.

Fields are the models of this axiom system.


In the theory of ordered fields one has in addition a binary relation
symbol $<$ and as axioms

    $x \not< x$,
    $x < y \to y < z \to x < z$,
    $x < y \lor x = y \lor y < x$,
    $x < y \to x + z < y + z$,
    $0 < x \to 0 < y \to 0 < x \cdot y$.

Ordered fields are the models of this extended axiom system. An ordered
field is called archimedian ordered, if for every element $a$ of the field there
is a natural number $n$ such that $a$ is less than the $n$-fold multiple of the 1
in the field.


THEOREM. For every archimedian ordered field there is an elementarily
equivalent ordered field that is not archimedian ordered.


PROOF. Let $\mathcal{K}$ be an archimedian ordered field, $x$ a variable and
\[ \Gamma := \mathrm{Th}(\mathcal{K}) \cup \{\, \underline{n} < x \mid n \in \mathbb{N} \,\}, \]
where $\underline{n}$ denotes the $n$-fold multiple of 1.
Clearly every finite subset of $\Gamma$ is satisfiable, hence by the general compactness
theorem also $\Gamma$. Therefore we have $\mathcal{M}$ and $\eta$ such that $\mathcal{M} \models \Gamma[\eta]$.
Because of $\mathcal{M} \models \mathrm{Th}(\mathcal{K})$ we obtain $\mathcal{M} \equiv \mathcal{K}$ and hence $\mathcal{M}$ is an ordered
field. Moreover $1^{\mathcal{M}} \cdot n <^{\mathcal{M}} \eta(x)$ for all $n \in \mathbb{N}$, hence $\mathcal{M}$ is not archimedian
ordered.
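6.7. Axiomatizable Structures. A class $S$ of $\mathcal{L}$-structures is called
(finitely) axiomatizable, if there is a (finite) axiom system $\Gamma$ such that
$S = \mathrm{Mod}_{\mathcal{L}}(\Gamma)$. Clearly $S$ is finitely axiomatizable iff $S = \mathrm{Mod}_{\mathcal{L}}(\{A\})$ for
some formula $A$. If for every $\mathcal{M} \in S$ there is an elementarily equivalent
$\mathcal{M}' \notin S$, then $S$ cannot possibly be axiomatizable. By the theorem above
we can conclude that the class of archimedian ordered fields is not axiomatizable.
It also follows that the class of non archimedian ordered fields is
not axiomatizable.


LEMMA. Let $S$ be a class of $\mathcal{L}$-structures and $\Gamma$ an axiom system.


(a) $S$ is finitely axiomatizable iff $S$ and the complement of $S$ are axiomatizable.

(b) If $\mathrm{Mod}_{\mathcal{L}}(\Gamma)$ is finitely axiomatizable, then there is a finite $\Gamma_0 \subseteq \Gamma$ such
that $\mathrm{Mod}_{\mathcal{L}}(\Gamma_0) = \mathrm{Mod}_{\mathcal{L}}(\Gamma)$.


PROOF. (a). Let $1 - S$ denote the complement of $S$.

$\Rightarrow$. Let $S = \mathrm{Mod}_{\mathcal{L}}(\{A\})$. Then $\mathcal{M} \in 1 - S \iff \mathcal{M} \models \neg A$, hence
$1 - S = \mathrm{Mod}_{\mathcal{L}}(\{\neg A\})$.

$\Leftarrow$. Let $S = \mathrm{Mod}_{\mathcal{L}}(\Gamma_1)$ and $1 - S = \mathrm{Mod}_{\mathcal{L}}(\Gamma_2)$. Then $\Gamma_1 \cup \Gamma_2$ is not
satisfiable, hence there is a finite $\Gamma \subseteq \Gamma_1$ such that $\Gamma \cup \Gamma_2$ is not satisfiable.
One obtains

\[ \mathcal{M} \in S \Rightarrow \mathcal{M} \models \Gamma \Rightarrow \mathcal{M} \not\models \Gamma_2 \Rightarrow \mathcal{M} \notin 1 - S \Rightarrow \mathcal{M} \in S, \]

hence $S = \mathrm{Mod}_{\mathcal{L}}(\Gamma)$.
(b). Let $\mathrm{Mod}_{\mathcal{L}}(\Gamma) = \mathrm{Mod}_{\mathcal{L}}(\{A\})$. Then $\Gamma \models A$, hence also $\Gamma_0 \models A$ for a
finite $\Gamma_0 \subseteq \Gamma$. One obtains

\[ \mathcal{M} \models \Gamma \Rightarrow \mathcal{M} \models \Gamma_0 \Rightarrow \mathcal{M} \models A \Rightarrow \mathcal{M} \models \Gamma, \]
hence $\mathrm{Mod}_{\mathcal{L}}(\Gamma_0) = \mathrm{Mod}_{\mathcal{L}}(\Gamma)$.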


6.8. Complete Linear Orders Without End Points. Finally we
consider as an example of a complete theory the theory DO of dense
linear orders without end points. The axioms are (the equality axioms and)

    $x \not< x$,                          $x < y \to \exists z .\ x < z \land z < y$,
    $x < y \to y < z \to x < z$,          $\exists y\ x < y$,
    $x < y \lor x = y \lor y < x$,        $\exists y\ y < x$.


LEMMA. Every countable model of DO is isomorphic to the structure
$(\mathbb{Q}, <)$ of rational numbers.


PROOF. Let $\mathcal{M} = (M, \prec)$ be a countable model of DO; we can assume
that $=^{\mathcal{M}}$ is the equality on $M$. Let $M = \{\, b_n \mid n \in \mathbb{N} \,\}$ and $\mathbb{Q} = \{\, a_n \mid n \in
\mathbb{N} \,\}$, where we may assume $a_n \neq a_m$ and $b_n \neq b_m$ for $n < m$. We define
recursively functions $f_n \subseteq \mathbb{Q} \times M$ as follows. Let $f_0 := \{(a_0, b_0)\}$. Assume
we have already constructed $f_n$.

Case $n + 1 = 2m$. Let $j$ be minimal such that $b_j \notin \mathrm{ran}(f_n)$. Choose $a_i \notin
\mathrm{dom}(f_n)$ such that for all $a \in \mathrm{dom}(f_n)$ we have $a_i < a \iff b_j \prec f_n(a)$; such an
$a_i$ exists, since $\mathcal{M}$ and $(\mathbb{Q}, <)$ are models of DO. Let $f_{n+1} := f_n \cup \{(a_i, b_j)\}$.

Case $n + 1 = 2m + 1$. This is treated similarly. Let $i$ be minimal such
that $a_i \notin \mathrm{dom}(f_n)$. Choose $b_j \notin \mathrm{ran}(f_n)$ such that for all $a \in \mathrm{dom}(f_n)$ we
have $a_i < a \iff b_j \prec f_n(a)$; such a $b_j$ exists, since $\mathcal{M}$ and $(\mathbb{Q}, <)$ are models
of DO. Let $f_{n+1} := f_n \cup \{(a_i, b_j)\}$.

Then $\{b_0, \dots, b_m\} \subseteq \mathrm{ran}(f_{2m})$ and $\{a_0, \dots, a_{m+1}\} \subseteq \mathrm{dom}(f_{2m+1})$ by
construction, and $f := \bigcup_n f_n$ is an isomorphism of $(\mathbb{Q}, <)$ onto $\mathcal{M}$.
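
The back-and-forth construction in this proof is effective once enumerations of both orders are given. The following Python sketch runs finitely many steps of it between $\mathbb{Q}$ and the dyadic rationals $k/2^n$; the choice of the dyadic rationals as the second model, the two enumerations, and all function names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd

def rationals():
    """Enumerate Q without repetition (by increasing |p| + q, p/q reduced)."""
    for s in count(1):
        for q in range(1, s + 1):
            p = s - q
            if gcd(p, q) == 1:
                yield Fraction(p, q)
                if p:
                    yield Fraction(-p, q)

def dyadics():
    """Enumerate the dyadic rationals k / 2**n without repetition."""
    for s in count(0):
        for n in range(s + 1):
            k = s - n
            if n == 0 or k % 2 == 1:
                yield Fraction(k, 2 ** n)
                if k:
                    yield Fraction(-k, 2 ** n)

def first(enum, pred):
    """First element of a fresh enumeration that satisfies pred."""
    return next(x for x in enum() if pred(x))

def back_and_forth(steps):
    """Return f_steps as a dict from rationals to dyadic rationals."""
    f = {next(rationals()): next(dyadics())}          # f_0 = {(a_0, b_0)}
    for n in range(1, steps + 1):
        ran = set(f.values())
        if n % 2 == 0:   # even step: cover the least missing b_j
            b = first(dyadics, lambda y: y not in ran)
            a = first(rationals, lambda x: x not in f and
                      all((x < a_) == (b < f[a_]) for a_ in f))
        else:            # odd step: cover the least missing a_i
            a = first(rationals, lambda x: x not in f)
            b = first(dyadics, lambda y: y not in ran and
                      all((a < a_) == (y < f[a_]) for a_ in f))
        f[a] = b
    return f

f = back_and_forth(20)
first_rationals = list(islice(rationals(), 8))
first_dyadics = list(islice(dyadics(), 8))
```

Each search terminates because both orders are dense and have no end points, which is exactly where the axioms of DO enter the proof.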


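THEOREM. The theory DO is complete, and $\mathrm{DO} = \mathrm{Th}(\mathbb{Q}, <)$.


PROOF. Clearly $(\mathbb{Q}, <)$ is a model of DO. Hence by 6.3 it suffices to
show that for every model $\mathcal{M}$ of DO we have $\mathcal{M} \equiv (\mathbb{Q}, <)$. So let $\mathcal{M}$ be a model
of DO. By 6.3 there is a countable $\mathcal{M}'$ such that $\mathcal{M} \equiv \mathcal{M}'$. By the preceding
lemma $\mathcal{M}' \cong (\mathbb{Q}, <)$, hence $\mathcal{M} \equiv \mathcal{M}' \equiv (\mathbb{Q}, <)$.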


A further example of a complete theory is the theory of algebraically 
closed fields. For a proof of this fact and for many more subjects of model 
theory we refer to the literature (e.g., the book of Chang and Keisler [6]). 


7. Notes 


The completeness theorem for classical logic was proved by Gödel
[10] in 1930. He did it for countable languages; the general case was
treated in 1936 by Malzew [17]. Löwenheim and Skolem proved their theorem
even before the completeness theorem was discovered: Löwenheim in 1915
[16] and Skolem in 1920 [24].

Beth-structures for intuitionistic logic were introduced by Beth
in 1956 [1]; however, the completeness proofs given there were in need of
correction. In 1959 Beth revised his paper in [2].


CHAPTER 3 
Computability 


In this chapter we develop the basics of recursive function theory, or as 
it is more generally known, computability theory. Its history goes back to 
the seminal works of Turing, Kleene and others in the 1930’s. 

A computable function is one defined by a program whose operational 
semantics tell an idealized computer what to do to its storage locations as 
it proceeds deterministically from input to output, without any prior re- 
strictions on storage space or computation time. We shall be concerned 
with various program-styles and the relationships between them, but the 
emphasis throughout will be on one underlying data-type, namely the nat- 
ural numbers, since it is there that the most basic foundational connections 
between proof theory and computation are to be seen in their clearest light. 

The two best-known models of machine computation are the Turing 
Machine and the (Unlimited) Register Machine of Shepherdson and Sturgis 
[22]. We base our development on the latter since it affords the quickest 
route to the results we want to establish. 


1. Register Machines 


1.1. Programs. A register machine stores natural numbers in registers
denoted $u$, $v$, $w$, $x$, $y$, $z$ possibly with subscripts, and it responds step by
step to a program consisting of an ordered list of basic instructions:

    $I_0$
    $I_1$
    ...
    $I_{k-1}$

Each instruction has one of the following three forms whose meanings are
obvious:

    Zero: $x := 0$
    Succ: $x := x + 1$
    Jump: if $x = y$ then $I_m$ else $I_n$

The instructions are obeyed in order starting with $I_0$ except when a conditional
jump instruction is encountered, in which case the next instruction
will be either $I_m$ or $I_n$ according as the numerical contents of registers $x$
and $y$ are equal or not at that stage. The computation terminates when it
runs out of instructions, that is when the next instruction called for is $I_k$.
Thus if a program of length $k$ contains a jump instruction as above then it
must satisfy the condition $m, n \le k$ and $I_k$ means “halt”. Notice of course
that some programs do not terminate, for example the following one-liner:

    if $x = x$ then $I_0$ else $I_1$
1.2. Program Constructs. We develop some shorthand for building
up standard sorts of programs.
Transfer. “$x := y$” is the program

    $x := 0$
    if $x = y$ then $I_4$ else $I_2$
    $x := x + 1$
    if $x = x$ then $I_1$ else $I_1$

which copies the contents of register $y$ into register $x$.
Predecessor. The program “$x := y \mathbin{\dot{-}} 1$” copies the modified predecessor
of $y$ into $x$, and simultaneously copies $y$ into $z$:

    $x := 0$
    $z := 0$
    if $x = y$ then $I_8$ else $I_3$
    $z := z + 1$
    if $z = y$ then $I_8$ else $I_5$
    $z := z + 1$
    $x := x + 1$
    if $z = y$ then $I_8$ else $I_5$.
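
To experiment with such programs, one can write a small interpreter. The sketch below is illustrative only: the tuple encoding of instructions, the function name `run` and the step cutoff `max_steps` are assumptions of this example, not part of the definition of register machines.

```python
def run(program, registers, max_steps=10_000):
    """Execute a register machine program.

    program   -- list of instructions I_0, ..., I_{k-1}, each one of
                 ("zero", x), ("succ", x), ("jump", x, y, m, n)
    registers -- dict mapping register names to natural numbers
                 (registers that are not mentioned start at 0)
    Returns the final register contents, or None if the machine has not
    halted after max_steps steps (the cutoff is an artefact of this sketch).
    """
    regs = dict(registers)
    pc = 0                                # index of the next instruction
    steps = 0
    while pc != len(program):             # I_k means "halt"
        if steps == max_steps:
            return None
        steps += 1
        ins = program[pc]
        if ins[0] == "zero":
            regs[ins[1]] = 0
            pc += 1
        elif ins[0] == "succ":
            regs[ins[1]] = regs.get(ins[1], 0) + 1
            pc += 1
        else:                             # ("jump", x, y, m, n)
            _, x, y, m, n = ins
            pc = m if regs.get(x, 0) == regs.get(y, 0) else n
    return regs

# The non-terminating one-liner from 1.1.
LOOP = [("jump", "x", "x", 0, 1)]

# Transfer "x := y".
TRANSFER = [
    ("zero", "x"),
    ("jump", "x", "y", 4, 2),
    ("succ", "x"),
    ("jump", "x", "x", 1, 1),
]

# Predecessor: x receives the modified predecessor of y, z a copy of y.
PRED = [
    ("zero", "x"),
    ("zero", "z"),
    ("jump", "x", "y", 8, 3),
    ("succ", "z"),
    ("jump", "z", "y", 8, 5),
    ("succ", "z"),
    ("succ", "x"),
    ("jump", "z", "y", 8, 5),
]
```

For instance, `run(PRED, {"y": 3})` leaves 2 in register `x` and 3 in register `z`, while `run(LOOP, {})` hits the cutoff and returns `None`.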


Composition. “$P$ ; $Q$” is the program obtained by concatenating program
$P$ with program $Q$. However in order to ensure that jump instructions
in $Q$ of the form “if $x = y$ then $I_m$ else $I_n$” still operate properly within $Q$
they need to be re-numbered by changing the addresses $m, n$ to $k + m, k + n$
respectively where $k$ is the length of program $P$. Thus the effect of this
program is to do $P$ until it halts (if ever) and then do $Q$.

Conditional. “if $x = y$ then $P$ else $Q$ fi” is the program

    if $x = y$ then $I_1$ else $I_{k+2}$
    : $P$
    if $x = x$ then $I_{k+2+l}$ else $I_2$
    : $Q$

where $k, l$ are the lengths of the programs $P, Q$ respectively, and again their
jump instructions must be appropriately renumbered by adding 1 to the
addresses in $P$ and $k + 2$ to the addresses in $Q$. Clearly if $x = y$ then
program $P$ is obeyed and the next jump instruction automatically bypasses
$Q$ and halts. If $x \neq y$ then program $Q$ is performed.
For Loop. “for $i = 1 \dots x$ do $P$ od” is the program

    $i := 0$
    if $x = i$ then $I_{k+4}$ else $I_2$
    $i := i + 1$
    : $P$
    if $x = i$ then $I_{k+4}$ else $I_2$

where again, $k$ is the length of program $P$ and the jump instructions in
$P$ must be appropriately re-addressed by adding 3. The intention of this
new program is that it should iterate the program $P$ $x$ times (do nothing
if $x = 0$). This requires the restriction that the register $x$ and the “local”
counting-register $i$ are not re-assigned new values inside $P$.
While Loop. “while $x \neq 0$ do $P$ od” is the program

    if $x = 0$ then $I_{k+2}$ else $I_1$
    : $P$
    if $x = 0$ then $I_{k+2}$ else $I_1$

where again, $k$ is the length of program $P$ and the jump instructions in $P$
must be re-addressed by adding 1. This program keeps on doing $P$ until (if
ever) the register $x$ becomes 0.
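
The re-addressing of jump instructions can be made concrete. The following sketch builds composition, the conditional and the for loop as functions on instruction lists; the tuple encoding, the interpreter `run` and the helper names `shift`, `compose`, `conditional`, `for_loop` and `transfer` are assumptions of this illustration.

```python
def run(program, registers, max_steps=100_000):
    """Tiny interpreter for ("zero", x), ("succ", x), ("jump", x, y, m, n)."""
    regs, pc = dict(registers), 0
    for _ in range(max_steps):
        if pc == len(program):            # address k = len(program) means halt
            return regs
        ins = program[pc]
        if ins[0] == "jump":
            _, x, y, m, n = ins
            pc = m if regs.get(x, 0) == regs.get(y, 0) else n
        else:
            regs[ins[1]] = 0 if ins[0] == "zero" else regs.get(ins[1], 0) + 1
            pc += 1
    return regs if pc == len(program) else None

def shift(program, k):
    """Re-address all jump instructions by adding k to both targets."""
    return [("jump", i[1], i[2], i[3] + k, i[4] + k) if i[0] == "jump" else i
            for i in program]

def compose(p, q):
    """P ; Q  -- the addresses in Q are shifted by the length of P."""
    return p + shift(q, len(p))

def conditional(x, y, p, q):
    """if x = y then P else Q fi"""
    k, l = len(p), len(q)
    return ([("jump", x, y, 1, k + 2)] + shift(p, 1)
            + [("jump", x, x, k + 2 + l, 2)] + shift(q, k + 2))

def for_loop(i, x, p):
    """for i = 1 ... x do P od  (P must not assign to i or x)."""
    k = len(p)
    return ([("zero", i), ("jump", x, i, k + 4, 2), ("succ", i)]
            + shift(p, 3)
            + [("jump", x, i, k + 4, 2)])

def transfer(x, y):
    """x := y"""
    return [("zero", x), ("jump", x, y, 4, 2), ("succ", x), ("jump", x, x, 1, 1)]

# z := x ; for i = 1, ..., y do z := z + 1 od
ADD = compose(transfer("z", "x"), for_loop("i", "y", [("succ", "z")]))

# if x = y then r := r + 1 else r := r + 1 ; r := r + 1 fi
COND = conditional("x", "y", [("succ", "r")], [("succ", "r"), ("succ", "r")])
```

Note that a halting address of $P$ (its own length $k$) is shifted along with all other addresses, so that it lands on the instruction immediately following the copy of $P$.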


1.3. Computable Functions. A register machine program $P$ may
have certain distinguished “input registers” and “output registers”. It may
also use other “working registers” for scratchwork and these will initially be
set to zero. We write $P(x_1, \dots, x_k; y)$ to signify that program $P$ has input
registers $x_1, \dots, x_k$ and one output register $y$, which are distinct.


DEFINITION. The program $P(x_1, \dots, x_k; y)$ is said to compute the $k$-ary
partial function $\varphi \colon \mathbb{N}^k \to \mathbb{N}$ if, starting with any numerical values $n_1, \dots, n_k$
in the input registers, the program terminates with the number $m$ in the
output register if and only if $\varphi(n_1, \dots, n_k)$ is defined with value $m$. In this
case, the input registers hold their original values.

A function is register machine computable if there is some program which
computes it.


Here are some examples.
Addition. “Add$(x, y; z)$” is the program

    $z := x$ ; for $i = 1, \dots, y$ do $z := z + 1$ od

which adds the contents of registers $x$ and $y$ into register $z$.
Subtraction. “Subt$(x, y; z)$” is the program

    $z := x$ ; for $i = 1, \dots, y$ do $w := z \mathbin{\dot{-}} 1$ ; $z := w$ od

which computes the modified subtraction function $x \mathbin{\dot{-}} y$.
Bounded Sum. If $P(x_1, \dots, x_k, w; y)$ computes the $k+1$-ary function $\varphi$
then the program $Q(x_1, \dots, x_k, z; x)$:

    $x := 0$ ;
    for $i = 1, \dots, z$ do $w := i \mathbin{\dot{-}} 1$ ; $P(\vec{x}, w; y)$ ; $v := x$ ; Add$(v, y; x)$ od

computes the function

\[ \psi(x_1, \dots, x_k, z) = \sum_{w < z} \varphi(x_1, \dots, x_k, w) \]

which will be undefined if for some $w < z$, $\varphi(x_1, \dots, x_k, w)$ is undefined.
Multiplication. Deleting “$w := i \mathbin{\dot{-}} 1$ ; $P$” from the last example gives a
program Mult$(z, y; x)$ which places the product of $y$ and $z$ into $x$.
Bounded Product. If in the bounded sum example, the instruction $x :=
x + 1$ is inserted immediately after $x := 0$, and if Add$(v, y; x)$ is replaced by
Mult$(v, y; x)$, then the resulting program computes the function

\[ \psi(x_1, \dots, x_k, z) = \prod_{w < z} \varphi(x_1, \dots, x_k, w). \]

Composition. If $P_j(x_1, \dots, x_k; y_j)$ computes $\varphi_j$ for each $j = 1, \dots, m$ and
if $P_0(y_1, \dots, y_m; y_0)$ computes $\varphi_0$, then the program $Q(x_1, \dots, x_k; y_0)$:

    $P_1(x_1, \dots, x_k; y_1)$ ; $\dots$ ; $P_m(x_1, \dots, x_k; y_m)$ ; $P_0(y_1, \dots, y_m; y_0)$

computes the function
\[ \psi(x_1, \dots, x_k) = \varphi_0(\varphi_1(x_1, \dots, x_k), \dots, \varphi_m(x_1, \dots, x_k)) \]
which will be undefined if any of the $\varphi$-subterms on the right hand side is
undefined.

Unbounded Minimization. If $P(x_1, \dots, x_k, y; z)$ computes $\varphi$ then the program
$Q(x_1, \dots, x_k; z)$:

    $y := 0$ ; $z := 0$ ; $z := z + 1$ ;
    while $z \neq 0$ do $P(x_1, \dots, x_k, y; z)$ ; $y := y + 1$ od ;
    $z := y \mathbin{\dot{-}} 1$

computes the function

\[ \psi(x_1, \dots, x_k) = \mu y\, (\varphi(x_1, \dots, x_k, y) = 0) \]

that is, the least number $y$ such that $\varphi(x_1, \dots, x_k, y')$ is defined for every
$y' \le y$ and $\varphi(x_1, \dots, x_k, y) = 0$.
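
The control flow of the bounded sum and unbounded minimization programs can be mirrored directly in a higher-level language. The Python sketch below models partial functions as Python functions that may fail to return; the names `monus`, `bounded_sum` and `mu` are chosen for this illustration.

```python
def monus(a, b):
    """Modified subtraction: a - b if a >= b, and 0 otherwise."""
    return a - b if a > b else 0

def bounded_sum(phi):
    """psi(xs, z) = sum of phi(xs, w) for w < z, following the loop above."""
    def psi(*args):
        *xs, z = args
        x = 0
        for i in range(1, z + 1):
            w = monus(i, 1)
            y = phi(*xs, w)       # P(xs, w; y)
            x = x + y             # v := x ; Add(v, y; x)
        return x
    return psi

def mu(phi):
    """psi(xs) = least y with phi(xs, y) = 0; loops forever if there is none."""
    def psi(*xs):
        y, z = 0, 1
        while z != 0:
            z = phi(*xs, y)       # P(xs, y; z)
            y = y + 1
        return monus(y, 1)
    return psi

mult = bounded_sum(lambda y, w: y)                  # y * z as a bounded sum
ceil_sqrt = mu(lambda x, y: monus(x, y * y))        # least y with y*y >= x
```

As in the register machine version, `mu` evaluates `phi` at $0, 1, 2, \dots$ in turn, so an undefined (non-terminating) value below the first zero makes the whole computation diverge.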


2. Elementary Functions 


2.1. Definition and Simple Properties. The elementary functions
of Kalmar (1943) are those number-theoretic functions which can be defined
explicitly by compositional terms built up from variables and the constants
0, 1 by repeated applications of addition $+$, modified subtraction $\dot{-}$, bounded
sums and bounded products.

By omitting bounded products, one obtains the subelementary functions.

The examples in the previous section show that all elementary functions
are computable and totally defined. Multiplication and exponentiation are
elementary since
\[ m \cdot n = \sum_{i<n} m \quad\text{and}\quad m^n = \prod_{i<n} m \]
and hence by repeated composition, all exponential polynomials are elementary.
In addition the elementary functions are closed under
Definitions by Cases.

\[ f(\vec{n}) = \begin{cases} g_0(\vec{n}) & \text{if } h(\vec{n}) = 0 \\ g_1(\vec{n}) & \text{otherwise} \end{cases} \]

since $f$ can be defined from $g_0$, $g_1$ and $h$ by

\[ f(\vec{n}) = g_0(\vec{n}) \cdot (1 \mathbin{\dot{-}} h(\vec{n})) + g_1(\vec{n}) \cdot (1 \mathbin{\dot{-}} (1 \mathbin{\dot{-}} h(\vec{n}))). \]
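
The arithmetic behind definition by cases, $f = g_0 \cdot (1 \mathbin{\dot{-}} h) + g_1 \cdot (1 \mathbin{\dot{-}} (1 \mathbin{\dot{-}} h))$, is easy to check numerically. In the sketch below the names `monus` and `by_cases` and the sample functions are hypothetical.

```python
def monus(a, b):
    """Modified subtraction on the natural numbers."""
    return a - b if a > b else 0

def by_cases(g0, g1, h):
    """f(n) = g0(n) if h(n) = 0, and g1(n) otherwise, without branching."""
    return lambda *n: (g0(*n) * monus(1, h(*n))
                       + g1(*n) * monus(1, monus(1, h(*n))))

# Sample instance: f(n) = n + 10 if n is divisible by 3, and 2n otherwise.
f = by_cases(lambda n: n + 10, lambda n: 2 * n, lambda n: n % 3)
```

The factor $1 \mathbin{\dot{-}} h$ is 1 exactly when $h = 0$, and $1 \mathbin{\dot{-}} (1 \mathbin{\dot{-}} h)$ is 1 exactly when $h \neq 0$, so exactly one summand survives.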
Bounded Minimization.
\[ f(\vec{n}, m) = \mu k{<}m\, (g(\vec{n}, k) = 0) \]

since $f$ can be defined from $g$ by

\[ f(\vec{n}, m) = \sum_{i<m} \Bigl( 1 \mathbin{\dot{-}} \sum_{k \le i} \bigl( 1 \mathbin{\dot{-}} g(\vec{n}, k) \bigr) \Bigr). \]

Note: this definition gives value $m$ if there is no $k < m$ such that $g(\vec{n}, k) =
0$. It shows that not only the elementary, but in fact the subelementary
functions are closed under bounded minimization. Furthermore, we define
$\mu k{\le}m\, (g(\vec{n}, k) = 0)$ as $\mu k{<}m{+}1\, (g(\vec{n}, k) = 0)$. Another notational convention
will be that we shall often replace the brackets in $\mu k{<}m\, (g(\vec{n}, k) = 0)$
by a dot, thus: $\mu k{<}m.\ g(\vec{n}, k) = 0$.
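
One way to express bounded minimization using only bounded sums and modified subtraction is $\sum_{i<m} (1 \mathbin{\dot{-}} \sum_{k \le i} (1 \mathbin{\dot{-}} g(\vec{n}, k)))$: the inner sum counts the zeros of $g$ up to $i$, so the outer sum counts those $i < m$ below the first zero. The sketch below compares this formula with a direct search; the function names are illustrative.

```python
def monus(a, b):
    """Modified subtraction on the natural numbers."""
    return a - b if a > b else 0

def bounded_mu_by_sums(g, m, *n):
    """Least k < m with g(n, k) = 0 (or m), using only sums and monus."""
    return sum(monus(1, sum(monus(1, g(*n, k)) for k in range(i + 1)))
               for i in range(m))

def bounded_mu_by_search(g, m, *n):
    """Reference implementation: search k = 0, ..., m-1 directly."""
    for k in range(m):
        if g(*n, k) == 0:
            return k
    return m

g = lambda n, k: monus(n, k * k)     # zero as soon as k*k >= n
```

Both functions return $m$ when $g$ has no zero below $m$, in line with the convention stated in the Note.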


LEMMA.
(a) For every elementary function $f \colon \mathbb{N}^r \to \mathbb{N}$ there is a number $k$ such that
for all $\vec{n} = n_1, \dots, n_r$,

\[ f(\vec{n}) < 2_k(\max(\vec{n})) \]

where $2_0(m) = m$ and $2_{k+1}(m) = 2^{2_k(m)}$.
(b) Hence the function $n \mapsto 2_n(1)$ is not elementary.


PROOF. (a). By induction on the build-up of the compositional term
defining $f$. The result clearly holds if $f$ is any one of the base functions:

\[ f(\vec{n}) = 0 \text{ or } 1 \text{ or } n_i \text{ or } n_i + n_j \text{ or } n_i \mathbin{\dot{-}} n_j. \]

If $f$ is defined from $g$ by application of bounded sum or product:

\[ f(\vec{n}, m) = \sum_{i<m} g(\vec{n}, i) \quad\text{or}\quad \prod_{i<m} g(\vec{n}, i) \]

where $g(\vec{n}, i) < 2_k(\max(\vec{n}, i))$ then we have
\[ f(\vec{n}, m) \le 2_k(\max(\vec{n}, m))^m < 2_{k+2}(\max(\vec{n}, m)) \]
(using $m^m < 2^{2^m}$). If $f$ is defined from $g_0, g_1, \dots, g_l$ by composition:
\[ f(\vec{n}) = g_0(g_1(\vec{n}), \dots, g_l(\vec{n})) \]
where for each $j \le l$ we have $g_j(-) < 2_{k_j}(\max(-))$, then with $k = \max_j k_j$,
\[ f(\vec{n}) < 2_k(2_k(\max(\vec{n}))) = 2_{2k}(\max(\vec{n})) \]
and this completes the first part.
(b). If $2_n(1)$ were an elementary function of $n$ then by (a) there would
be a positive $k$ such that for all $n$,

\[ 2_n(1) < 2_k(n) \]

but then putting $n = 2_k(1)$ yields $2_{2_k(1)}(1) < 2_{2k}(1)$, a contradiction.
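
The iterated exponential $2_k(m)$ and the two identities used in this proof can be checked for small arguments. The sketch below is only a numerical illustration (the function name `tower` is an assumption); larger arguments quickly exceed any feasible computation.

```python
def tower(k, m):
    """2_k(m): 2_0(m) = m and 2_{k+1}(m) = 2 ** 2_k(m)."""
    for _ in range(k):
        m = 2 ** m
    return m

# 2_k(2_k(m)) = 2_{2k}(m), used in the composition step and in part (b).
composition_identity = all(tower(k, tower(k, m)) == tower(2 * k, m)
                           for k in range(3) for m in range(3))

# m^m < 2^(2^m), used in the bounded sum and product step.
power_bound = all(m ** m < 2 ** (2 ** m) for m in range(12))
```

Already $2_5(1) = 2^{65536}$ has close to twenty thousand decimal digits, which gives a feeling for why $n \mapsto 2_n(1)$ outgrows every fixed $2_k$.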




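2.2. Elementary Relations. A relation $R$ on $\mathbb{N}^k$ is said to be elementary
if its characteristic function

\[ c_R(\vec{n}) = \begin{cases} 1 & \text{if } R(\vec{n}) \\ 0 & \text{otherwise} \end{cases} \]

is elementary. In particular, the “equality” and “less than” relations are
elementary since their characteristic functions can be defined as follows:

\[ c_<(m, n) = 1 \mathbin{\dot{-}} (1 \mathbin{\dot{-}} (n \mathbin{\dot{-}} m)) ; \qquad c_=(m, n) = 1 \mathbin{\dot{-}} (c_<(m, n) + c_<(n, m)). \]
Furthermore if $R$ is elementary then so is the function

\[ f(\vec{n}, m) = \mu k{<}m\ R(\vec{n}, k) \]
since $R(\vec{n}, k)$ is equivalent to $1 \mathbin{\dot{-}} c_R(\vec{n}, k) = 0$.


LEMMA. The elementary relations are closed under applications of propositional
connectives and bounded quantifiers.


PROOF. For example, the characteristic function of $\neg R$ is
\[ 1 \mathbin{\dot{-}} c_R(\vec{n}). \]
The characteristic function of $R_0 \land R_1$ is
\[ c_{R_0}(\vec{n}) \cdot c_{R_1}(\vec{n}). \]
The characteristic function of $\forall i{<}m\ R(\vec{n}, i)$ is

\[ c_=(m, \mu i{<}m.\ c_R(\vec{n}, i) = 0). \]


EXAMPLES. The above closure properties enable us to show that many
“natural” functions and relations of number theory are elementary; thus

\[ \left\lfloor \frac{m}{n} \right\rfloor = \mu k{<}m\, (m < (k+1) n) \]

\[ m \bmod n = m \mathbin{\dot{-}} \left\lfloor \frac{m}{n} \right\rfloor n \]

\[ \mathrm{Prime}(m) \iff 1 < m \land \neg \exists n{<}m\, (1 < n \land m \bmod n = 0) \]

\[ p_n = \mu m{<}2^{2^n} \Bigl( \mathrm{Prime}(m) \land n = \sum_{i<m} c_{\mathrm{Prime}}(i) \Bigr) \]
so $p_0, p_1, p_2, \dots$ gives the enumeration of primes in increasing order. The
estimate $p_n \le 2^{2^n}$ for the $n$th prime $p_n$ can be proved by induction on $n$:
For $n = 0$ this is clear, and for $n \ge 1$ we obtain

\[ p_n \le p_0 p_1 \cdots p_{n-1} + 1 \le 2^{2^0} 2^{2^1} \cdots 2^{2^{n-1}} + 1 = 2^{2^n - 1} + 1 < 2^{2^n}. \]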
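
These definitions translate almost literally into code built on a bounded search. The Python sketch below follows the formulas for $\lfloor m/n \rfloor$, $m \bmod n$, Prime and $p_n$; the helper name `bmu` and the other function names are assumptions of this illustration, and no attempt is made at efficiency.

```python
def monus(a, b):
    """Modified subtraction on the natural numbers."""
    return a - b if a > b else 0

def bmu(bound, pred):
    """Least k < bound with pred(k), and bound itself if there is none."""
    for k in range(bound):
        if pred(k):
            return k
    return bound

def div(m, n):
    """floor(m / n) as the least k < m with m < (k + 1) * n."""
    return bmu(m, lambda k: m < (k + 1) * n)

def mod(m, n):
    """m mod n as m monus floor(m / n) * n."""
    return monus(m, div(m, n) * n)

def is_prime(m):
    """1 < m and no n < m with 1 < n divides m."""
    return 1 < m and not any(1 < n and mod(m, n) == 0 for n in range(m))

def nth_prime(n):
    """p_n: least m < 2^(2^n) that is prime and has exactly n primes below it."""
    return bmu(2 ** 2 ** n,
               lambda m: is_prime(m) and n == sum(is_prime(i) for i in range(m)))
```

The case $n = 0$ shows the convention for bounded minimization at work: no $m < 2$ is prime, so the search returns the bound 2 itself, which happens to be $p_0$.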


2.3. The Class $\mathcal{E}$.


DEFINITION. The class $\mathcal{E}$ consists of those number theoretic functions
which can be defined from the initial functions: constant 0, successor $S$,
projections (onto the $i$th coordinate), addition $+$, modified subtraction $\dot{-}$,
multiplication $\cdot$ and exponentiation $2^x$, by applications of composition and
bounded minimization.

The remarks above show immediately that the characteristic functions
of the equality and less than relations lie in $\mathcal{E}$, and that (by the proof of the
lemma) the relations in $\mathcal{E}$ are closed under propositional connectives and
bounded quantifiers.

Furthermore the above examples show that all the functions in the class
$\mathcal{E}$ are elementary. We now prove the converse, which will be useful later.


LEMMA. There are “pairing functions” $\pi, \pi_1, \pi_2$ in $\mathcal{E}$ with the following
properties:

(a) $\pi$ maps $\mathbb{N} \times \mathbb{N}$ bijectively onto $\mathbb{N}$,
(b) $\pi(a, b) < (a + b + 1)^2$,
(c) $\pi_1(c), \pi_2(c) \le c$,
(d) $\pi(\pi_1(c), \pi_2(c)) = c$,
(e) $\pi_1(\pi(a, b)) = a$,
(f) $\pi_2(\pi(a, b)) = b$.


PROOF. Enumerate the pairs of natural numbers as follows:

    10
     6   ...
     3   7   ...
     1   4   8   ...
     0   2   5   9   ...

At position $(0, b)$ we clearly have the sum of the lengths of the preceding
diagonals, and on the next diagonal $a + b$ remains constant. Let $\pi(a, b)$ be
the number written at position $(a, b)$. Then we have

\[ \pi(a, b) = \Bigl( \sum_{i \le a+b} i \Bigr) + a = \tfrac{1}{2}(a + b)(a + b + 1) + a. \]

Clearly $\pi \colon \mathbb{N} \times \mathbb{N} \to \mathbb{N}$ is bijective. Moreover, $a, b \le \pi(a, b)$ and in case
$\pi(a, b) \neq 0$ also $a < \pi(a, b)$. Let

\[ \pi_1(c) := \mu x{\le}c\ \exists y{\le}c\, (\pi(x, y) = c), \]
\[ \pi_2(c) := \mu y{\le}c\ \exists x{\le}c\, (\pi(x, y) = c). \]

Then clearly $\pi_i(c) \le c$ for $i \in \{1, 2\}$ and
\[ \pi_1(\pi(a, b)) = a, \]
\[ \pi_2(\pi(a, b)) = b, \]
\[ \pi(\pi_1(c), \pi_2(c)) = c. \]

$\pi$, $\pi_1$ and $\pi_2$ are elementary by definition.
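
The pairing function and its inverses can be checked on an initial segment of the natural numbers. The sketch below follows the bounded searches in the proof literally (so it is slow); the names `pair`, `pi1` and `pi2` are chosen for this illustration.

```python
def pair(a, b):
    """pi(a, b) = (a + b)(a + b + 1) / 2 + a."""
    return (a + b) * (a + b + 1) // 2 + a

def pi1(c):
    """Least x <= c such that pair(x, y) = c for some y <= c."""
    return next(x for x in range(c + 1)
                if any(pair(x, y) == c for y in range(c + 1)))

def pi2(c):
    """Least y <= c such that pair(x, y) = c for some x <= c."""
    return next(y for y in range(c + 1)
                if any(pair(x, y) == c for x in range(c + 1)))

bottom_row = [pair(a, 0) for a in range(4)]       # positions (a, 0)
left_column = [pair(0, b) for b in range(5)]      # positions (0, b)
```

With these definitions `bottom_row` is `[0, 2, 5, 9]` and `left_column` is `[0, 1, 3, 6, 10]`, matching the bottom row and the left column of the enumeration diagram.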


REMARK. The above proof shows that $\pi$, $\pi_1$ and $\pi_2$ are in fact subelementary.

LEMMA (Gödel). There is in $\mathcal{E}$ a function $\beta$ with the following property:
For every sequence $a_0, \dots, a_{n-1} < b$ of numbers less than $b$ we can find a
number $c \le 4 \cdot 4^{n(b+n+1)^4}$ such that $\beta(c, i) = a_i$ for all $i < n$.

PROOF. Let
\[ a := \pi(b, n) \quad\text{and}\quad d := \prod_{i<n} \bigl( 1 + \pi(a_i, i)\, a! \bigr). \]
From $a!$ and $d$ we can, for each given $i < n$, reconstruct the number $a_i$ as
the unique $x < b$ such that

\[ 1 + \pi(x, i)\, a! \mid d. \]
For clearly $a_i$ is such an $x$, and if some $x < b$ were to satisfy the same
condition, then because $\pi(x, i) < a$ and the numbers $1 + k a!$ are relatively
prime for $k \le a$, we would have $\pi(x, i) = \pi(a_j, j)$ for some $j < n$. Hence
$x = a_j$ and $i = j$, thus $x = a_i$.
We can now define the Gödel $\beta$-function as

\[ \beta(c, i) = \pi_1\bigl( \mu y{<}c.\ (1 + \pi(\pi_1(y), i) \cdot \pi_1(c)) \cdot \pi_2(y) = \pi_2(c) \bigr). \]
Clearly $\beta$ is in $\mathcal{E}$. Furthermore with $c := \pi(a!, d)$ we see that
$\pi(a_i, \lceil d / (1 + \pi(a_i, i)\, a!) \rceil)$ is the unique such $y$, and therefore $\beta(c, i) = a_i$. It is then not
difficult to estimate the given bound on $c$, using $\pi(b, n) < (b + n + 1)^2$.


REMARK. The above definition of 3 shows that it is subelementary. 


2.4. Closure Properties of €. 


THEOREM. The class E is closed under limited recursion. Thus if g, h, k 
are given functions in E and f is defined from them according to the scheme 


f(m,0) — =g(m) 
f(m,n +1) = h(n, f(m%, n), m) 
f (m,n) < k(m,n) 

then f is in E also. 

PROOF. Let $f$ be defined from $g$, $h$ and $k$ in $\mathcal{E}$, by limited
recursion as above. Using Gödel's $\beta$-function as in the last lemma we can
find for any given $\vec{m}, n$ a number $c$ such that
$\beta(c,i) = f(\vec{m},i)$ for all $i \le n$. Let $R(\vec{m},n,c)$ be the
relation

\[ \beta(c,0) = g(\vec{m}) \wedge \forall i{<}n.\ \beta(c,i+1) = h(i, \beta(c,i), \vec{m}) \]

and note by the remarks above that its characteristic function is in
$\mathcal{E}$. It is clear, by induction, that if $R(\vec{m},n,c)$ holds then
$\beta(c,i) = f(\vec{m},i)$, for all $i \le n$. Therefore we can define $f$
explicitly by the equation

\[ f(\vec{m},n) = \beta(\mu c\, R(\vec{m},n,c),\ n). \]

$f$ will lie in $\mathcal{E}$ if $\mu c$ can be bounded by an $\mathcal{E}$
function. However, Lemma 2.3 gives a bound $4 \cdot 4^{(n+1)(b+n+2)^4}$ where
in this case $b$ can be taken as the maximum of $k(\vec{m},i)$ for $i \le n$.
But this can be defined in $\mathcal{E}$ as $k(\vec{m}, i_0)$, where
$i_0 = \mu i{\le}n.\ \forall j{\le}n.\ k(\vec{m},j) \le k(\vec{m},i)$. Hence
$\mu c$ can be bounded by an $\mathcal{E}$ function.


REMARK. Notice that it is in this proof only that the exponential function is
required, in providing a bound for $\mu$.

COROLLARY. $\mathcal{E}$ is the class of all elementary functions.

PROOF. It is sufficient merely to show that $\mathcal{E}$ is closed under
bounded sums and bounded products. Suppose for instance, that $f$ is defined
from $g$ in $\mathcal{E}$ by bounded summation:
$f(\vec{m},n) = \sum_{i<n} g(\vec{m},i)$. Then $f$ can be defined by limited
recursion, as follows

\[ f(\vec{m}, 0) = 0, \]
\[ f(\vec{m}, n+1) = f(\vec{m}, n) + g(\vec{m}, n), \]
\[ f(\vec{m}, n) \le n \cdot \max_{i<n} g(\vec{m}, i), \]

and the functions (including the bound) from which it is defined are in
$\mathcal{E}$. Thus $f$ is in $\mathcal{E}$ by the theorem above. If instead,
$f$ is defined by bounded product, then proceed similarly.
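
As a small illustration of how a bounded sum fits the limited recursion scheme, here is a hedged Python sketch. The function name `bounded_sum` and the sample function `g` are assumptions made for this example only; the code simply runs the recursion step by step and checks the stated bound along the way.

```python
def bounded_sum(g, m: int, n: int) -> int:
    """Compute f(m, n) = sum_{i<n} g(m, i) by the recursion
    f(m, 0) = 0, f(m, i+1) = f(m, i) + g(m, i),
    asserting the bound f(m, i) <= i * max_{j<i} g(m, j) after each step."""
    acc = 0
    for i in range(n):
        acc = acc + g(m, i)                                # recursion step
        bound = (i + 1) * max(g(m, j) for j in range(i + 1))
        assert acc <= bound                                # limited-recursion bound
    return acc


def g(m: int, i: int) -> int:
    """A hypothetical sample summand."""
    return m * i + 1
```

The bound is crude but suffices: a sum of n terms is at most n times its largest term.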


2.5. Coding Finite Lists. Computation on lists is a practical necessity, so
because we are basing everything here on the single data type $\mathbb{N}$ we
must develop some means of "coding" finite lists or sequences of natural
numbers into $\mathbb{N}$ itself. There are various ways to do this and we
shall adopt one of the most traditional, based on the pairing functions $\pi$,
$\pi_1$, $\pi_2$.

The empty sequence is coded by the number 0 and a sequence
$n_0, n_1, \dots, n_{k-1}$ is coded by the "sequence number"

\[ \langle n_0, n_1, \dots, n_{k-1} \rangle = \pi'(\dots \pi'(\pi'(0, n_0), n_1), \dots, n_{k-1}) \]

with $\pi'(a,b) := \pi(a,b) + 1$, thus recursively,

\[ \langle\rangle := 0, \]
\[ \langle n_0, n_1, \dots, n_k \rangle := \pi'(\langle n_0, n_1, \dots, n_{k-1} \rangle, n_k). \]

Because of the surjectivity of $\pi$, every number $a$ can be decoded uniquely
as a sequence number $a = \langle n_0, n_1, \dots, n_{k-1} \rangle$. If $a$ is
greater than zero, $\mathrm{hd}(a) := \pi_2(a \dot{-} 1)$ is the "head" (i.e.
rightmost element) and $\mathrm{tl}(a) := \pi_1(a \dot{-} 1)$ is the "tail" of
the list. The $k$th iterate of tl is denoted $\mathrm{tl}^{(k)}$ and since
$\mathrm{tl}(a)$ is less than or equal to $a$, $\mathrm{tl}^{(k)}(a)$ is
elementarily definable (by limited recursion). Thus we can define elementarily
the "length" and "decoding" functions:

\[ \mathrm{lh}(a) := \mu k{\le}a.\ \mathrm{tl}^{(k)}(a) = 0, \]
\[ (a)_i := \mathrm{hd}\bigl(\mathrm{tl}^{(\mathrm{lh}(a) \dot{-} (i+1))}(a)\bigr). \]

Then if $a = \langle n_0, n_1, \dots, n_{k-1} \rangle$ it is easy to check that

\[ \mathrm{lh}(a) = k \quad\text{and}\quad (a)_i = n_i \text{ for each } i < k. \]

Furthermore $(a)_i = 0$ when $i \ge \mathrm{lh}(a)$. We shall write $(a)_{i,j}$
for $((a)_i)_j$ and $(a)_{i,j,k}$ for $(((a)_i)_j)_k$. This elementary coding
machinery will be used at various crucial points in the following.
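
The coding and decoding of lists can be sketched in Python as follows. This is an illustrative sketch only: the function names are chosen here, `unpair` uses a closed form instead of bounded search, and `item` follows the convention stated in the text that $(a)_i = 0$ for $i \ge \mathrm{lh}(a)$ by testing that case explicitly.

```python
from math import isqrt


def pair(a: int, b: int) -> int:
    s = a + b
    return s * (s + 1) // 2 + a


def unpair(c: int) -> tuple[int, int]:
    s = (isqrt(8 * c + 1) - 1) // 2
    a = c - s * (s + 1) // 2
    return a, s - a


def encode(ns) -> int:
    """<n_0,...,n_{k-1}>, built left to right with pi'(a, b) = pi(a, b) + 1."""
    a = 0
    for n in ns:
        a = pair(a, n) + 1
    return a


def hd(a: int) -> int:
    """Head (rightmost element): pi_2(a -. 1)."""
    return unpair(max(a - 1, 0))[1]


def tl(a: int) -> int:
    """Tail: pi_1(a -. 1)."""
    return unpair(max(a - 1, 0))[0]


def lh(a: int) -> int:
    """Least k with tl^(k)(a) = 0."""
    k = 0
    while a != 0:
        a, k = tl(a), k + 1
    return k


def item(a: int, i: int) -> int:
    """(a)_i; returns 0 for i >= lh(a), following the text's convention."""
    n = lh(a)
    if i >= n:
        return 0
    for _ in range(n - (i + 1)):
        a = tl(a)
    return hd(a)
```

For example, `encode([3, 1, 4])` builds the code from the left, and `item` walks back from the right end using `tl`.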

Note that our previous remarks show that the functions lh and $(a)_i$ are
subelementary, and so is $\langle n_0, n_1, \dots, n_{k-1} \rangle$ for each
fixed $k$.

Concatenation of sequence numbers $b * a$ is defined thus:

\[ b * \langle\rangle := b, \]
\[ b * \langle n_0, n_1, \dots, n_k \rangle := \pi(b * \langle n_0, n_1, \dots, n_{k-1} \rangle, n_k) + 1. \]

To check that this operation is also elementary, define $h(b,a,i)$ by
recursion on $i$ as follows.

\[ h(b,a,0) = b, \]
\[ h(b,a,i+1) = \pi(h(b,a,i), (a)_i) + 1 \]

and note that since $\pi(h(b,a,i), (a)_i) < (h(b,a,i) + a)^2$ it follows by
induction on $i$ that $h(b,a,i)$ is less than or equal to $(b+a+i)^{2^i}$.
Thus $h$ is definable by limited recursion from elementary functions and hence
is itself elementary. Finally

\[ b * a = h(b, a, \mathrm{lh}(a)). \]
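
A short Python sketch of concatenation via the auxiliary function $h$ is given below. The helper names (`items`, `concat`) are assumptions of this example, and decoding is done with a closed-form inverse of the pairing function rather than by bounded search.

```python
from math import isqrt


def pair(a: int, b: int) -> int:
    s = a + b
    return s * (s + 1) // 2 + a


def unpair(c: int) -> tuple[int, int]:
    s = (isqrt(8 * c + 1) - 1) // 2
    a = c - s * (s + 1) // 2
    return a, s - a


def encode(ns) -> int:
    a = 0
    for n in ns:
        a = pair(a, n) + 1
    return a


def items(a: int) -> list[int]:
    """Decode a sequence number into the list [(a)_0, ..., (a)_{lh(a)-1}]."""
    out = []
    while a != 0:
        a, n = unpair(a - 1)   # peel off the rightmost element
        out.append(n)
    return out[::-1]


def concat(b: int, a: int) -> int:
    """b * a = h(b, a, lh(a)) with h(b,a,0) = b, h(b,a,i+1) = pi(h(b,a,i), (a)_i) + 1."""
    h = b
    for n in items(a):
        h = pair(h, n) + 1
    return h
```

Concatenating the codes of two lists then yields the code of the joined list.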
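LEMMA. The class $\mathcal{E}$ is closed under limited course-of-values
recursion. Thus if $h$, $k$ are given functions in $\mathcal{E}$ and $f$ is
defined from them according to the scheme

\[ f(\vec{m}, n) = h(n, \langle f(\vec{m},0), \dots, f(\vec{m},n-1) \rangle, \vec{m}), \]
\[ f(\vec{m}, n) \le k(\vec{m}, n), \]

then $f$ is in $\mathcal{E}$ also.


PROOF. $\bar{f}(\vec{m},n) := \langle f(\vec{m},0), \dots, f(\vec{m},n-1) \rangle$ is definable by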


3. The Normal Form Theorem 


3.1. Program Numbers. The three types of register machine instructions $I$
can be coded by "instruction numbers" $\sharp I$ thus, where
$v_0, v_1, v_2, \dots$ is a list of all variables used to denote registers:

If $I$ is "$v_j := 0$" then $\sharp I = \langle 0, j \rangle$.

If $I$ is "$v_j := v_j + 1$" then $\sharp I = \langle 1, j \rangle$.

If $I$ is "if $v_j = v_l$ then $I_m$ else $I_n$" then
$\sharp I = \langle 2, j, l, m, n \rangle$.

Clearly, using the sequence coding and decoding apparatus above, we can check
elementarily whether or not a given number is an instruction number.

Any register machine program $P = I_0, I_1, \dots, I_{k-1}$ can then be coded
by a "program number" or "index" $\sharp P$ thus:

\[ \sharp P = \langle \sharp I_0, \sharp I_1, \dots, \sharp I_{k-1} \rangle \]

and again (although it is tedious) we can elementarily check whether or not a
given number is indeed of the form $\sharp P$ for some program $P$. Tradition
has it that $e$ is normally reserved as a variable over putative program
numbers.

Standard program constructs such as those in Section 1 have associated
"index-constructors", i.e. functions which, given indices of the subprograms,
produce an index for the constructed program. The point is that for standard
program constructs the associated index-constructor functions are elementary.
For example there is an elementary index-constructor comp such that, given
programs $P_0$, $P_1$ with indices $e_0$, $e_1$, $\mathrm{comp}(e_0, e_1)$ is
an index of the program $P_0 ; P_1$. A moment's thought should convince the
reader that the appropriate definition of comp is as follows:

\[ \mathrm{comp}(e_0, e_1) = e_0 * \langle r(e_0,e_1,0),\ r(e_0,e_1,1),\ \dots,\ r(e_0,e_1,\mathrm{lh}(e_1) \dot{-} 1) \rangle \]

where

\[ r(e_0,e_1,i) = \begin{cases}
\langle 2,\ (e_1)_{i,1},\ (e_1)_{i,2},\ (e_1)_{i,3} + \mathrm{lh}(e_0),\ (e_1)_{i,4} + \mathrm{lh}(e_0) \rangle & \text{if } (e_1)_{i,0} = 2, \\
(e_1)_i & \text{otherwise}
\end{cases} \]

re-addresses the jump instructions in $P_1$. Clearly $r$ and hence comp are
elementary functions.
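
The re-addressing performed by comp can be illustrated without numeric coding. The following Python sketch is an assumption-laden simplification: instructions are represented as plain tuples `(0, j)`, `(1, j)` and `(2, j, l, m, n)` instead of sequence numbers, programs as lists of such tuples, and the names `readdress` and `comp` are chosen for this example.

```python
def readdress(instr: tuple, offset: int) -> tuple:
    """Shift both jump targets of a conditional instruction by `offset`;
    other instructions are returned unchanged (this mirrors r in the text)."""
    if instr[0] == 2:
        _, j, l, m, n = instr
        return (2, j, l, m + offset, n + offset)
    return instr


def comp(p0: list, p1: list) -> list:
    """Program for 'P0 ; P1': P0 followed by P1 with its jumps re-addressed."""
    return list(p0) + [readdress(instr, len(p0)) for instr in p1]
```

Here the offset is the length of the first program, which plays the role of $\mathrm{lh}(e_0)$.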


DEFINITION. Henceforth, $\varphi_e^{(r)}$ denotes the partial function
computed by the register machine program with program number $e$, operating on
the input registers $v_1, \dots, v_r$ and with output register $v_0$. There is
no loss of generality here, since the variables in any program can always be
renamed so that $v_1, \dots, v_r$ become the input registers and $v_0$ the
output. If $e$ is not a program number, or it is but does not operate on the
right variables, then we adopt the convention that
$\varphi_e^{(r)}(n_1, \dots, n_r)$ is undefined for all inputs
$n_1, \dots, n_r$.


3.2. Normal Form. 


THEOREM (Kleene's Normal Form). For each arity $r$ there is an elementary
function $U$ and an elementary relation $T$ such that, for all $e$ and all
inputs $n_1, \dots, n_r$,

• $\varphi_e^{(r)}(n_1, \dots, n_r)$ is defined $\iff \exists s\, T(e, n_1, \dots, n_r, s)$,

• $\varphi_e^{(r)}(n_1, \dots, n_r) = U(e, n_1, \dots, n_r, \mu s\, T(e, n_1, \dots, n_r, s))$.


PROOF. A computation of a register machine program $P(v_1, \dots, v_r; v_0)$
on numerical inputs $\vec{n} = n_1, \dots, n_r$ proceeds deterministically,
step by step, each step corresponding to the execution of one instruction. Let
$e$ be its program number, and let $v_0, \dots, v_l$ be all the registers used
by $P$, including the "working registers" so $r \le l$.

The "state" of the computation at step $s$ is defined to be the sequence
number

\[ \mathrm{state}(e, \vec{n}, s) = \langle e, i, m_0, m_1, \dots, m_l \rangle \]

where $m_0, m_1, \dots, m_l$ are the values stored in the registers
$v_0, v_1, \dots, v_l$ after step $s$ is completed, and the next instruction to
be performed is the $i$th one, thus $(e)_i$ is its instruction number.

The "state transition function" $\mathrm{tr}\colon \mathbb{N} \to \mathbb{N}$
computes the "next state". So suppose that
$x = \langle e, i, m_0, m_1, \dots, m_l \rangle$ is any putative state. Then in
what follows, $e = (x)_0$, $i = (x)_1$, and $m_j = (x)_{j+2}$ for each
$j \le l$. The definition of $\mathrm{tr}(x)$ is therefore as follows:

\[ \mathrm{tr}(x) = \langle e, i', m_0', m_1', \dots, m_l' \rangle \]

where

• If $(e)_i = \langle 0, j \rangle$ where $j \le l$ then $i' = i + 1$,
  $m_j' = 0$, and all other registers remain unchanged, i.e. $m_k' = m_k$ for
  $k \ne j$.

• If $(e)_i = \langle 1, j \rangle$ where $j \le l$ then $i' = i + 1$,
  $m_j' = m_j + 1$, and all other registers remain unchanged.

• If $(e)_i = \langle 2, j_0, j_1, i_0, i_1 \rangle$ where $j_0, j_1 \le l$
  and $i_0, i_1 \le \mathrm{lh}(e)$ then $i' = i_0$ or $i' = i_1$ according as
  $m_{j_0} = m_{j_1}$ or not, and all registers remain unchanged, i.e.
  $m_j' = m_j$ for all $j \le l$.

• Otherwise, if $x$ is not a sequence number, or if $e$ is not a program
  number, or if it refers to a register $v_k$ with $l < k$, or if
  $\mathrm{lh}(e) \le i$, then $\mathrm{tr}(x)$ simply repeats the same state
  $x$ so $i' = i$, and $m_j' = m_j$ for every $j \le l$.

Clearly tr is an elementary function, since it is defined by elementarily
decidable cases, with (a great deal of) elementary decoding and re-coding
involved in each case.
Consequently, the "state function" $\mathrm{state}(e, \vec{n}, s)$ is also
elementary because it can be defined by iterating the transition function by
limited recursion on $s$ as follows:

\[ \mathrm{state}(e, \vec{n}, 0) = \langle e, 0, n_1, \dots, n_r, 0, \dots, 0 \rangle, \]
\[ \mathrm{state}(e, \vec{n}, s+1) = \mathrm{tr}(\mathrm{state}(e, \vec{n}, s)), \]
\[ \mathrm{state}(e, \vec{n}, s) \le h(e, \vec{n}, s), \]

where for the bounding function $h$ we can take

\[ h(e, \vec{n}, s) = \langle e, e \rangle * \langle \max(\vec{n}) + s, \dots, \max(\vec{n}) + s \rangle. \]

This is because the maximum value of any register at step $s$ cannot be
greater than $\max(\vec{n}) + s$. Now this expression clearly is elementary,
since $\langle m, \dots, m \rangle$ with $i$ occurrences of $m$ is definable by
a limited recursion with bound $(m+i)^{2^i}$, as is easily seen by induction on
$i$.

Now recall that if program $P$ has program number $e$ then computation
terminates when instruction $I_{\mathrm{lh}(e)}$ is encountered. Thus we can
define the "termination relation" $T(e, \vec{n}, s)$ meaning "computation
terminates at step $s$", by

\[ T(e, \vec{n}, s) \iff (\mathrm{state}(e, \vec{n}, s))_1 = \mathrm{lh}(e). \]

Clearly $T$ is elementary and

\[ \varphi_e^{(r)}(\vec{n}) \text{ is defined} \iff \exists s\, T(e, \vec{n}, s). \]

The output on termination is the value of register $v_0$, so if we define the
"output function" $U(e, \vec{n}, s)$ by

\[ U(e, \vec{n}, s) = (\mathrm{state}(e, \vec{n}, s))_2 \]

then $U$ is also elementary and

\[ \varphi_e^{(r)}(\vec{n}) = U(e, \vec{n}, \mu s\, T(e, \vec{n}, s)). \]

This completes the proof.
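
The structure of this proof (state, transition function, termination test, output) can be mirrored by a small simulator. The Python sketch below is only an illustration under simplifying assumptions: programs are lists of instruction tuples rather than program numbers, a state is a pair `(i, registers)` rather than a sequence number, the output register $v_0$ is initialised to 0 with the inputs placed in $v_1, \dots, v_r$, and the unbounded search for the terminating step is cut off at a hypothetical `max_steps`. The sample programs `ADD` and `LOOP` are made up for this example.

```python
def num_registers(prog, r: int) -> int:
    """Number of registers v_0..v_l needed: inputs plus all registers mentioned."""
    used = [r]
    for ins in prog:
        used.extend(ins[1:3] if ins[0] == 2 else ins[1:2])
    return max(used) + 1


def initial(prog, inputs):
    regs = [0] * num_registers(prog, len(inputs))
    regs[1:len(inputs) + 1] = inputs          # inputs go to v_1..v_r
    return (0, tuple(regs))


def tr(prog, state):
    """One step of the state transition function."""
    i, regs = state
    if i >= len(prog):                        # nothing to execute: repeat state
        return state
    ins, regs = prog[i], list(regs)
    if ins[0] == 0:                           # v_j := 0
        regs[ins[1]] = 0
        i += 1
    elif ins[0] == 1:                         # v_j := v_j + 1
        regs[ins[1]] += 1
        i += 1
    else:                                     # if v_j0 = v_j1 then I_i0 else I_i1
        _, j0, j1, i0, i1 = ins
        i = i0 if regs[j0] == regs[j1] else i1
    return (i, tuple(regs))


def phi(prog, inputs, max_steps: int = 10_000):
    """U(e, n, mu s T(e, n, s)), with the search cut off at max_steps (None if not reached)."""
    st = initial(prog, inputs)
    for _ in range(max_steps + 1):
        if st[0] == len(prog):                # T: next instruction index equals lh(e)
            return st[1][0]                   # U: contents of v_0
        st = tr(prog, st)
    return None


# v_0 := v_1 + v_2, using v_3 as a working counter.
ADD = [(0, 0), (0, 3), (2, 3, 1, 6, 3), (1, 0), (1, 3), (2, 0, 0, 2, 2),
       (0, 3), (2, 3, 2, 11, 8), (1, 0), (1, 3), (2, 0, 0, 7, 7)]
LOOP = [(2, 0, 0, 0, 0)]                      # jumps to itself forever
```

Running `phi(ADD, (2, 3))` steps through the program until the instruction pointer reaches the program length; for `LOOP` the termination test never succeeds, so the cut-off returns `None`.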


3.3. $\Sigma^0_1$-Definable Relations and $\mu$-Recursive Functions. A
relation $R$ of arity $r$ is said to be $\Sigma^0_1$-definable if there is an
elementary relation $E$, say of arity $r + l$, such that for all
$\vec{n} = n_1, \dots, n_r$,

\[ R(\vec{n}) \iff \exists k_1 \dots \exists k_l\, E(\vec{n}, k_1, \dots, k_l). \]

A partial function $\varphi$ is said to be $\Sigma^0_1$-definable if its graph

\[ \{\, (\vec{n}, m) \mid \varphi(\vec{n}) \text{ is defined and } = m \,\} \]

is $\Sigma^0_1$-definable.

To say that a non-empty relation $R$ is $\Sigma^0_1$-definable is equivalent
to saying that the set of all sequences $\langle \vec{n} \rangle$ satisfying
$R$ can be enumerated (possibly with repetitions) by some elementary function
$f\colon \mathbb{N} \to \mathbb{N}$. Such relations are called elementarily
enumerable. For choose any fixed sequence $\langle a_1, \dots, a_r \rangle$
satisfying $R$ and define

\[ f(m) = \begin{cases}
\langle (m)_1, \dots, (m)_r \rangle & \text{if } E((m)_1, \dots, (m)_{r+l}), \\
\langle a_1, \dots, a_r \rangle & \text{otherwise.}
\end{cases} \]

Conversely if $R$ is elementarily enumerated by $f$ then

\[ R(\vec{n}) \iff \exists m\, (f(m) = \langle \vec{n} \rangle) \]

is a $\Sigma^0_1$-definition of $R$.

The $\mu$-recursive functions are those (partial) functions which can be
defined from the initial functions: constant 0, successor $S$, projections
(onto the $i$th coordinate), addition $+$, modified subtraction $\dot{-}$ and
multiplication $\cdot$, by applications of composition and unbounded
minimization. Note that it is through unbounded minimization that partial
functions may arise.


LEMMA. Every elementary function is $\mu$-recursive.


PROOF. By simply removing the bounds on $\mu$ in the lemmas in 2.3 one obtains
$\mu$-recursive definitions of the pairing functions $\pi$, $\pi_1$, $\pi_2$
and of Gödel's $\beta$-function. Then by removing all mention of bounds from
the Theorem in 2.4 one sees that the $\mu$-recursive functions are closed under
(unlimited) primitive recursive definitions: $f(\vec{m}, 0) = g(\vec{m})$,
$f(\vec{m}, n+1) = h(n, f(\vec{m}, n))$. Thus one can $\mu$-recursively define
bounded sums and bounded products, and hence all elementary functions.


3.4. Computable Functions. 


DEFINITION. The while-programs are those programs which can be built up from
assignment statements $x := 0$, $x := y$, $x := y + 1$, $x := y \dot{-} 1$, by
Conditionals, Composition, For-Loops and While-Loops as in the subsection on
program constructs in Section 1.


THEOREM. The following are equivalent:

(a) $\varphi$ is register machine computable,
(b) $\varphi$ is $\Sigma^0_1$-definable,
(c) $\varphi$ is $\mu$-recursive,
(d) $\varphi$ is computable by a while program.


PROOF. The Normal Form Theorem shows immediately that every register machine
computable function $\varphi_e^{(r)}$ is $\Sigma^0_1$-definable since

\[ \varphi_e^{(r)}(\vec{n}) = m \iff \exists s.\ T(e, \vec{n}, s) \wedge U(e, \vec{n}, s) = m \]

and the relation $T(e, \vec{n}, s) \wedge U(e, \vec{n}, s) = m$ is clearly
elementary. If $\varphi$ is $\Sigma^0_1$-definable, say

\[ \varphi(\vec{n}) = m \iff \exists k_1 \dots \exists k_l\, E(\vec{n}, m, k_1, \dots, k_l) \]

then $\varphi$ can be defined $\mu$-recursively by

\[ \varphi(\vec{n}) = \bigl(\mu m\, E(\vec{n}, (m)_0, (m)_1, \dots, (m)_l)\bigr)_0 \]

using the fact (above) that elementary functions are $\mu$-recursive. The
examples of computable functionals in Section 1 show how the definition of any
$\mu$-recursive function translates automatically into a while program.
Finally, the subsection on program constructs in Section 1 shows how to
implement any while program on a register machine.


Henceforth computable means "register machine computable" or any of its
equivalents.


COROLLARY. The function $\varphi_e^{(r)}(n_1, \dots, n_r)$ is a computable
partial function of the $r + 1$ variables $e, n_1, \dots, n_r$.


PROOF. Immediate from the Normal Form.


LEMMA. A relation $R$ is computable if and only if both $R$ and its complement
$\mathbb{N}^n \setminus R$ are $\Sigma^0_1$-definable.


PROOF. We can assume that both $R$ and $\mathbb{N}^n \setminus R$ are not
empty, and (for simplicity) also $n = 1$.

$\Rightarrow$. By the theorem above every computable relation is
$\Sigma^0_1$-definable, and with $R$ clearly its complement is computable.

$\Leftarrow$. Let $f, g \in \mathcal{E}$ enumerate $R$ and
$\mathbb{N} \setminus R$, respectively. Then

\[ h(n) := \mu i.\ f(i) = n \vee g(i) = n \]

is a total $\mu$-recursive function, and $R(n) \iff f(h(n)) = n$.
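
The direction from enumerations to a decision procedure is easy to try out. In the Python sketch below, `decide` is a name chosen for this example, and the two sample enumerations (of the even and of the odd numbers) are hypothetical stand-ins for the elementary functions $f$ and $g$ of the proof.

```python
from itertools import count


def decide(f, g, n: int) -> bool:
    """Decide R(n), where f enumerates R and g enumerates its complement.

    h is the least i with f(i) = n or g(i) = n; it exists because every n
    lies in exactly one of the two enumerated sets."""
    h = next(i for i in count() if f(i) == n or g(i) == n)
    return f(h) == n


def evens(i: int) -> int:
    return 2 * i


def odds(i: int) -> int:
    return 2 * i + 1
```

The search always terminates here because the two enumerations together cover all natural numbers.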


3.5. Undecidability of the Halting Problem. The above corollary says that
there is a single "universal" program which, given numbers $e$ and $\vec{n}$,
computes $\varphi_e^{(r)}(\vec{n})$ if it is defined. However we cannot decide
in advance whether or not it will be defined. There is no program which, given
$e$ and $\vec{n}$, computes the total function

\[ h(e, \vec{n}) = \begin{cases}
1 & \text{if } \varphi_e^{(r)}(\vec{n}) \text{ is defined,} \\
0 & \text{if } \varphi_e^{(r)}(\vec{n}) \text{ is undefined.}
\end{cases} \]

For suppose there were such a program. Then the function

\[ \psi(\vec{n}) = \mu m\, (h(n_1, \vec{n}) = 0) \]

would be computable, say with fixed program number $e_0$, and therefore

\[ \varphi_{e_0}^{(r)}(\vec{n}) = \begin{cases}
0 & \text{if } h(n_1, \vec{n}) = 0, \\
\text{undefined} & \text{if } h(n_1, \vec{n}) = 1.
\end{cases} \]

But then fixing $n_1 = e_0$ gives:

\[ \varphi_{e_0}^{(r)}(\vec{n}) \text{ defined} \iff h(e_0, \vec{n}) = 0 \iff \varphi_{e_0}^{(r)}(\vec{n}) \text{ undefined,} \]

a contradiction. Hence the relation $R(e, \vec{n})$ which holds if and only if
$\varphi_e^{(r)}(\vec{n})$ is defined, is not recursive. It is however
$\Sigma^0_1$-definable.

There are numerous attempts to classify total computable functions according
to the complexity of their termination proofs.




4. Recursive Definitions 


4.1. Least Fixed Points of Recursive Definitions. By a recursive definition
of a partial function $\varphi$ of arity $r$ from given partial functions
$\psi_1, \dots, \psi_m$ of fixed but unspecified arities, we mean a defining
equation of the form

\[ \varphi(n_1, \dots, n_r) = t(\psi_1, \dots, \psi_m, \varphi; n_1, \dots, n_r) \]

where $t$ is any compositional term built up from the numerical variables
$\vec{n} = n_1, \dots, n_r$ and the constant 0 by repeated applications of the
successor and predecessor functions, the given functions
$\psi_1, \dots, \psi_m$, the function $\varphi$ itself, and the "definition by
cases" function:

\[ \mathrm{dc}(x, y, u, v) = \begin{cases}
u & \text{if } x, y \text{ are both defined and equal,} \\
v & \text{if } x, y \text{ are both defined and unequal,} \\
\text{undefined} & \text{otherwise.}
\end{cases} \]

Our notion of recursive definition is essentially a reformulation of the
Herbrand-Gödel-Kleene equation calculus; see Kleene [15].

There may be many partial functions $\varphi$ satisfying such a recursive
definition, but the one we wish to single out is the least defined one, i.e.
the one whose defined values arise inevitably by lazy evaluation of the term
$t$ "from the outside in", making only those function calls which are
absolutely necessary. This presupposes that each of the functions from which
$t$ is constructed already comes equipped with an evaluation strategy. In
particular if a subterm $\mathrm{dc}(t_1, t_2, t_3, t_4)$ is called then it is
to be evaluated according to the program construct:

\[ x := t_1 ;\ y := t_2 ;\ \text{if } x = y \text{ then } t_3 \text{ else } t_4. \]

Some of the function calls demanded by the term $t$ may be for further values
of $\varphi$ itself, and these must be evaluated by repeated unravellings of
$t$ (in other words by recursion).

This "least solution" $\varphi$ will be referred to as the function defined by
that recursive definition or its least fixed point. Its existence and its
computability are guaranteed by Kleene's Recursion Theorem below.


4.2. The Principles of Finite Support and Monotonicity, and the Effective
Index Property. Suppose we are given any fixed partial functions
$\psi_1, \dots, \psi_m$ and $\psi$, of the appropriate arities, and fixed
inputs $\vec{n}$. If the term $t = t(\psi_1, \dots, \psi_m, \psi; \vec{n})$
evaluates to a defined value $k$ then the following principles are required to
hold:

Finite Support Principle. Only finitely many values of
$\psi_1, \dots, \psi_m$ and $\psi$ are used in that evaluation of $t$.

Monotonicity Principle. The same value $k$ will be obtained no matter how the
partial functions $\psi_1, \dots, \psi_m$ and $\psi$ are extended.

Note also that any such term $t$ satisfies the

Effective Index Property. There is an elementary function $f$ such that if
$\psi_1, \dots, \psi_m$ and $\psi$ are computable partial functions with
program numbers $e_1, \dots, e_m$ and $e$ respectively, then according to the
lazy evaluation strategy just described,

\[ t(\psi_1, \dots, \psi_m, \psi; \vec{n}) \]

defines a computable function of $\vec{n}$ with program number
$f(e_1, \dots, e_m, e)$.

The proof of the Effective Index Property is by induction over the build-up of
the term $t$. The base case is where $t$ is just one of the constants 0, 1 or
a variable $n_j$, in which case it defines either a constant function
$\vec{n} \mapsto 0$ or $\vec{n} \mapsto 1$, or a projection function
$\vec{n} \mapsto n_j$. Each of these is trivially computable with a fixed
program number, and it is this program number we take as the value of
$f(e_1, \dots, e_m, e)$. Since in this case $f$ is a constant function, it is
clearly elementary. The induction step is where $t$ is built up by applying
one of the given functions: successor, predecessor, definition by cases or
$\psi$ (with or without a subscript) to previously constructed subterms
$t_i(\psi_1, \dots, \psi_m, \psi; \vec{n})$, $i = 1 \dots l$, thus:

\[ t = \psi(t_1, \dots, t_l). \]

Inductively we can assume that for each $i = 1 \dots l$, $t_i$ defines a
partial function of $\vec{n} = n_1, \dots, n_r$ which is register machine
computable by some program $P_i$ with program number given by an
already-constructed elementary function $f_i = f_i(e_1, \dots, e_m, e)$.
Therefore if $\psi$ is computed by a program $Q$ with program number $e$, we
can put $P_1, \dots, P_l$ and $Q$ together to construct a new program obeying
the evaluation strategy for $t$. Furthermore, by the remark on
index-constructions near the beginning of Section 3, we will be able to
compute its program number $f(e_1, \dots, e_m, e)$ from the given numbers
$f_1, \dots, f_l$ and $e$, by some elementary function.


4.3. Recursion Theorem. 


THEOREM (Kleene's Recursion Theorem). For given partial functions
$\psi_1, \dots, \psi_m$, every recursive definition

\[ \varphi(\vec{n}) = t(\psi_1, \dots, \psi_m, \varphi; \vec{n}) \]

has a least fixed point, i.e. a least defined solution, $\varphi$. Moreover if
$\psi_1, \dots, \psi_m$ are computable, so is the least fixed point $\varphi$.


PROOF. Let $\psi_1, \dots, \psi_m$ be fixed partial functions of the
appropriate arities. Let $\Phi$ be the functional from partial functions of
arity $r$ to partial functions of arity $r$ defined by lazy evaluation of the
term $t$ as described above:

\[ \Phi(\psi)(\vec{n}) = t(\psi_1, \dots, \psi_m, \psi; \vec{n}). \]

Let $\varphi_0, \varphi_1, \varphi_2, \dots$ be the sequence of partial
functions of arity $r$ generated by $\Phi$ thus: $\varphi_0$ is the completely
undefined function, and $\varphi_{i+1} = \Phi(\varphi_i)$ for each $i$. Then by
induction on $i$, using the Monotonicity Principle above, we see that each
$\varphi_i$ is a subfunction of $\varphi_{i+1}$. That is, whenever
$\varphi_i(\vec{n})$ is defined with a value $k$ then
$\varphi_{i+1}(\vec{n})$ is defined with that same value. Since their defined
values are consistent with one another we can therefore construct the "union"
$\varphi$ of the $\varphi_i$'s as follows:

\[ \varphi(\vec{n}) = k \iff \exists i\, (\varphi_i(\vec{n}) = k). \]

(i) This $\varphi$ is then the required least fixed point of the recursive
definition.

To see that it is a fixed point, i.e. $\varphi = \Phi(\varphi)$, first suppose
$\varphi(\vec{n})$ is defined with value $k$. Then by the definition of
$\varphi$ just given, there is an $i > 0$ such that $\varphi_i(\vec{n})$ is
defined with value $k$. But $\varphi_i = \Phi(\varphi_{i-1})$ so
$\Phi(\varphi_{i-1})(\vec{n})$ is defined with value $k$. Therefore by the
Monotonicity Principle for $\Phi$, since $\varphi_{i-1}$ is a subfunction of
$\varphi$, $\Phi(\varphi)(\vec{n})$ is defined with value $k$. Hence $\varphi$
is a subfunction of $\Phi(\varphi)$.

It remains to show the converse, that $\Phi(\varphi)$ is a subfunction of
$\varphi$. So suppose $\Phi(\varphi)(\vec{n})$ is defined with value $k$. Then
by the Finite Support Principle, only finitely many defined values of
$\varphi$ are called for in this evaluation. By the definition of $\varphi$
there must be some $i$ such that $\varphi_i$ already supplies all of these
required values, and so already at stage $i$ we have
$\Phi(\varphi_i)(\vec{n}) = \varphi_{i+1}(\vec{n})$ defined with value $k$.
Since $\varphi_{i+1}$ is a subfunction of $\varphi$ it follows that
$\varphi(\vec{n})$ is defined with value $k$. Hence $\Phi(\varphi)$ is a
subfunction of $\varphi$.

To see that $\varphi$ is the least such fixed point, suppose $\varphi'$ is any
fixed point of $\Phi$. Then $\Phi(\varphi') = \varphi'$ so by the Monotonicity
Principle, since $\varphi_0$ is a subfunction of $\varphi'$ it follows that
$\Phi(\varphi_0) = \varphi_1$ is a subfunction of $\Phi(\varphi') = \varphi'$.
Then again by Monotonicity, $\Phi(\varphi_1) = \varphi_2$ is a subfunction of
$\Phi(\varphi') = \varphi'$ etcetera so that for each $i$, $\varphi_i$ is a
subfunction of $\varphi'$. Since $\varphi$ is the union of the $\varphi_i$'s
it follows that $\varphi$ itself is a subfunction of $\varphi'$. Hence
$\varphi$ is the least fixed point of $\Phi$.

(ii) Finally we have to show that $\varphi$ is computable if the given
functions $\psi_1, \dots, \psi_m$ are. For this we need the Effective Index
Property of the term $t$, which supplies an elementary function $f$ such that
if $\psi$ is computable with program number $e$ then $\Phi(\psi)$ is computable
with program number $f(e) = f(e_1, \dots, e_m, e)$. Thus if $u$ is any fixed
program number for the completely undefined function of arity $r$, $f(u)$ is a
program number for $\varphi_1 = \Phi(\varphi_0)$, $f^2(u) = f(f(u))$ is a
program number for $\varphi_2 = \Phi(\varphi_1)$, and in general $f^i(u)$ is a
program number for $\varphi_i$. Therefore in the notation of the Normal Form
Theorem,

\[ \varphi_i(\vec{n}) = \varphi^{(r)}_{f^i(u)}(\vec{n}) \]

and by the second corollary to the Normal Form Theorem, this is a computable
function of $i$ and $\vec{n}$, since $f^i(u)$ is a computable function of $i$
definable (informally) say by a for-loop of the form "for $j = 1 \dots i$ do
$f$ od". Therefore by the earlier equivalences, $\varphi_i(\vec{n})$ is a
$\Sigma^0_1$-definable function of $i$ and $\vec{n}$, and hence so is $\varphi$
itself because

\[ \varphi(\vec{n}) = m \iff \exists i\, (\varphi_i(\vec{n}) = m). \]

So $\varphi$ is computable and this completes the proof.
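
The chain of approximations used in this proof can be made concrete. The Python sketch below is an illustration under stated assumptions: partial functions are modelled as Python functions that raise a custom `Undefined` exception where they have no value, and the functional `Phi` is hard-wired to one sample recursive definition, namely $\varphi(m,n) = \mathrm{dc}(n, 0, m, \varphi(m, n \dot{-} 1) + 1)$, whose least fixed point is addition. All names are chosen for this example.

```python
from itertools import count


class Undefined(Exception):
    """Raised when a partial function has no value at the given arguments."""


def phi0(m: int, n: int) -> int:
    """The completely undefined function."""
    raise Undefined


def Phi(phi):
    """One lazy unravelling of dc(n, 0, m, phi(m, n -. 1) + 1)."""
    def new_phi(m: int, n: int) -> int:
        if n == 0:                 # x = y: take the third argument, phi is not called
            return m
        return phi(m, n - 1) + 1   # x != y: take the fourth argument
    return new_phi


def approx(i: int):
    """phi_i = Phi applied i times to the undefined function."""
    phi = phi0
    for _ in range(i):
        phi = Phi(phi)
    return phi


def value(phi, m: int, n: int):
    """Value of phi at (m, n), or None if undefined."""
    try:
        return phi(m, n)
    except Undefined:
        return None


def least_fixed_point(m: int, n: int) -> int:
    """Union of the phi_i: search for the first stage at which a value appears."""
    for i in count():
        v = value(approx(i), m, n)
        if v is not None:
            return v
```

In this example $\varphi_i(m,n)$ is defined exactly when $n < i$, and each stage extends the previous one without changing earlier values, as the Monotonicity Principle predicts.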


NOTE. The above proof works equally well if $\varphi$ is a vector-valued
function. In other words if, instead of defining a single partial function
$\varphi$, the recursive definition in fact defines a finite list
$\vec{\varphi}$ of such functions simultaneously. For example, the individual
components of the machine state of any register machine at step $s$ are
clearly defined by a simultaneous recursive definition, from zero and
successor.


4.4. Recursive Programs and Partial Recursive Functions. A recursive program
is a finite sequence of possibly simultaneous recursive definitions:

\[ \vec{\varphi}_0(n_1, \dots, n_{r_0}) = t_0(\vec{\varphi}_0; n_1, \dots, n_{r_0}) \]
\[ \vec{\varphi}_1(n_1, \dots, n_{r_1}) = t_1(\vec{\varphi}_0, \vec{\varphi}_1; n_1, \dots, n_{r_1}) \]
\[ \vec{\varphi}_2(n_1, \dots, n_{r_2}) = t_2(\vec{\varphi}_0, \vec{\varphi}_1, \vec{\varphi}_2; n_1, \dots, n_{r_2}) \]
\[ \vdots \]
\[ \vec{\varphi}_k(n_1, \dots, n_{r_k}) = t_k(\vec{\varphi}_0, \dots, \vec{\varphi}_{k-1}, \vec{\varphi}_k; n_1, \dots, n_{r_k}). \]

A partial function is said to be partial recursive if it is one of the
functions defined by some recursive program as above. A partial recursive
function which happens to be totally defined is called simply a recursive
function.

THEOREM. A function is partial recursive if and only if it is computable.


PROOF. The Recursion Theorem tells us immediately that every partial recursive
function is computable. For the converse we use the equivalence of
computability with $\mu$-recursiveness already established in Section 3. Thus
we need only show how to translate any $\mu$-recursive definition into a
recursive program:

The constant 0 function is defined by the recursive program

\[ \varphi(\vec{n}) = 0 \]

and similarly for the constant 1 function.

The addition function $\varphi(m,n) = m + n$ is defined by the recursive
program

\[ \varphi(m,n) = \mathrm{dc}(n, 0, m, \varphi(m, n \dot{-} 1) + 1) \]

and the subtraction function $\varphi(m,n) = m \dot{-} n$ is defined similarly
but with the successor function $+1$ replaced by the predecessor $\dot{-} 1$.
Multiplication is defined recursively from addition in much the same way. Note
that in each case the right hand side of the recursive definition is an
allowed term.

The composition scheme is a recursive definition as it stands.

Finally, given a recursive program defining $\psi$, if we add to it the
recursive definition:

\[ \varphi(\vec{n}, m) = \mathrm{dc}(\psi(\vec{n}, m), 0, m, \varphi(\vec{n}, m+1)) \]

followed by

\[ \varphi'(\vec{n}) = \varphi(\vec{n}, 0) \]

then the computation of $\varphi'(\vec{n})$ proceeds as follows:

\[ \begin{aligned}
\varphi'(\vec{n}) &= \varphi(\vec{n}, 0) \\
&= \varphi(\vec{n}, 1) && \text{if } \psi(\vec{n}, 0) \ne 0 \\
&\;\;\vdots \\
&= \varphi(\vec{n}, m) && \text{if } \psi(\vec{n}, m-1) \ne 0 \\
&= m && \text{if } \psi(\vec{n}, m) = 0.
\end{aligned} \]

Thus the recursive program for $\varphi'$ defines unbounded minimization:

\[ \varphi'(\vec{n}) = \mu m\, (\psi(\vec{n}, m) = 0). \]

This completes the proof.
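
Unbounded minimization as a recursive definition can be sketched in Python. The sketch assumes the standard definition $\varphi(\vec{n}, m) = \mathrm{dc}(\psi(\vec{n}, m), 0, m, \varphi(\vec{n}, m+1))$ followed by $\varphi'(\vec{n}) = \varphi(\vec{n}, 0)$; the names `dc` and `minimize`, the use of thunks to model lazy evaluation of the last two arguments, and the sample function `psi` are all choices made for this illustration.

```python
def dc(x, y, u, v):
    """Definition by cases; u and v are thunks, so only the selected
    branch is evaluated (a model of the lazy evaluation strategy)."""
    return u() if x == y else v()


def minimize(psi):
    """Return phi' with phi'(n) = mu m (psi(n, m) = 0), via a recursive definition."""
    def phi(n: int, m: int) -> int:
        return dc(psi(n, m), 0, lambda: m, lambda: phi(n, m + 1))
    return lambda n: phi(n, 0)


def psi(n: int, m: int) -> int:
    """Hypothetical sample: zero exactly when m*m >= n."""
    return 0 if m * m >= n else 1
```

With this `psi`, `minimize(psi)` computes the least m whose square is at least n. If `psi` never returned 0 the recursion would not terminate, which is exactly how partiality arises.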


CHAPTER 4 


Gödel's Theorems


1. Gödel Numbers


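1.1. Coding Terms and Formulas. We use the elementary sequence-coding and
decoding machinery developed earlier. Let $\mathcal{L}$ be a countable first
order language. Assume that we have injectively assigned to every $n$-ary
relation symbol $R$ a symbol number $\mathrm{SN}(R)$ of the form
$\langle 1, n, i \rangle$ and to every $n$-ary function symbol $f$ a symbol
number $\mathrm{SN}(f)$ of the form $\langle 2, n, j \rangle$. Call
$\mathcal{L}$ elementarily presented, if the set $\mathrm{Symb}_{\mathcal{L}}$
of all these symbol numbers is elementary. In what follows we shall always
assume that the languages $\mathcal{L}$ considered are elementarily presented.
In particular this applies to every language with finitely many relation and
function symbols.

Assign numbers to the logical symbols by
$\mathrm{SN}(\wedge) := \langle 3, 1 \rangle$,
$\mathrm{SN}(\to) := \langle 3, 2 \rangle$ and
$\mathrm{SN}(\forall) := \langle 3, 3 \rangle$, and to the $i$-th variable
assign the symbol number $\langle 0, i \rangle$.

For every $\mathcal{L}$-term $t$ we define recursively its Gödel number
$\ulcorner t \urcorner$ by

\[ \ulcorner x \urcorner := \langle \mathrm{SN}(x) \rangle, \]
\[ \ulcorner c \urcorner := \langle \mathrm{SN}(c) \rangle, \]
\[ \ulcorner f t_1 \dots t_n \urcorner := \langle \mathrm{SN}(f), \ulcorner t_1 \urcorner, \dots, \ulcorner t_n \urcorner \rangle. \]

Similarly we recursively define for every $\mathcal{L}$-formula $A$ its Gödel
number $\ulcorner A \urcorner$ by

\[ \ulcorner R t_1 \dots t_n \urcorner := \langle \mathrm{SN}(R), \ulcorner t_1 \urcorner, \dots, \ulcorner t_n \urcorner \rangle, \]
\[ \ulcorner A \wedge B \urcorner := \langle \mathrm{SN}(\wedge), \ulcorner A \urcorner, \ulcorner B \urcorner \rangle, \]
\[ \ulcorner A \to B \urcorner := \langle \mathrm{SN}(\to), \ulcorner A \urcorner, \ulcorner B \urcorner \rangle, \]
\[ \ulcorner \forall x\, A \urcorner := \langle \mathrm{SN}(\forall), \ulcorner x \urcorner, \ulcorner A \urcorner \rangle. \]

Let $\mathrm{Var} := \{\, \langle \langle 0, i \rangle \rangle \mid i \in \mathbb{N} \,\}$.
Var clearly is elementary, and we have $a \in \mathrm{Var}$ if and only if
$a = \ulcorner x \urcorner$ for a variable $x$. We define
$\mathrm{Ter} \subseteq \mathbb{N}$ as follows, by course-of-values recursion.

\[ \begin{aligned}
a \in \mathrm{Ter} :\iff{} & a \in \mathrm{Var} \ \vee \\
& \bigl((a)_0 \in \mathrm{Symb}_{\mathcal{L}} \wedge (a)_{0,0} = 2 \wedge \mathrm{lh}(a) = (a)_{0,1} + 1 \wedge \forall_{0<i<\mathrm{lh}(a)}\, (a)_i \in \mathrm{Ter}\bigr).
\end{aligned} \]

Ter is elementary, and it is easily seen that $a \in \mathrm{Ter}$ if and only
if $a = \ulcorner t \urcorner$ for some term $t$. Similarly
$\mathrm{For} \subseteq \mathbb{N}$ is defined by

\[ \begin{aligned}
a \in \mathrm{For} :\iff{} & \bigl((a)_0 \in \mathrm{Symb}_{\mathcal{L}} \wedge (a)_{0,0} = 1 \wedge \mathrm{lh}(a) = (a)_{0,1} + 1 \wedge \forall_{0<i<\mathrm{lh}(a)}\, (a)_i \in \mathrm{Ter}\bigr) \ \vee \\
& \bigl(a = \langle \mathrm{SN}(\wedge), (a)_1, (a)_2 \rangle \wedge (a)_1 \in \mathrm{For} \wedge (a)_2 \in \mathrm{For}\bigr) \ \vee \\
& \bigl(a = \langle \mathrm{SN}(\to), (a)_1, (a)_2 \rangle \wedge (a)_1 \in \mathrm{For} \wedge (a)_2 \in \mathrm{For}\bigr) \ \vee \\
& \bigl(a = \langle \mathrm{SN}(\forall), (a)_1, (a)_2 \rangle \wedge (a)_1 \in \mathrm{Var} \wedge (a)_2 \in \mathrm{For}\bigr).
\end{aligned} \]

Again For is elementary, and we have $a \in \mathrm{For}$ if and only if
$a = \ulcorner A \urcorner$ for some formula $A$. For a set $S$ of formulas let
$\ulcorner S \urcorner := \{\, \ulcorner A \urcorner \mid A \in S \,\}$.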
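
The Gödel numbering of terms and the membership test for Ter can be sketched in Python. The sketch makes several assumptions that are not in the text: terms are represented as tuples `('var', i)` or `('fun', j, [subterms])`, every number of the form $\langle 2, n, j \rangle$ is treated as a symbol number of the language, and decoding uses a closed-form inverse of the pairing function. The function names are chosen for this example.

```python
from math import isqrt


def pair(a: int, b: int) -> int:
    s = a + b
    return s * (s + 1) // 2 + a


def unpair(c: int) -> tuple[int, int]:
    s = (isqrt(8 * c + 1) - 1) // 2
    a = c - s * (s + 1) // 2
    return a, s - a


def encode(ns) -> int:
    a = 0
    for n in ns:
        a = pair(a, n) + 1
    return a


def items(a: int) -> list[int]:
    out = []
    while a != 0:
        a, n = unpair(a - 1)
        out.append(n)
    return out[::-1]


def gn(term) -> int:
    """Goedel number of a term ('var', i) or ('fun', j, [subterms])."""
    if term[0] == 'var':
        return encode([encode([0, term[1]])])
    _, j, args = term
    return encode([encode([2, len(args), j])] + [gn(t) for t in args])


def is_var(a: int) -> bool:
    xs = items(a)
    if len(xs) != 1:
        return False
    sym = items(xs[0])
    return len(sym) == 2 and sym[0] == 0


def is_term(a: int) -> bool:
    """Course-of-values test following the definition of Ter."""
    if is_var(a):
        return True
    xs = items(a)
    if not xs:
        return False
    sym = items(xs[0])
    # Assumption: every <2, n, j> counts as a function symbol number.
    if len(sym) != 3 or sym[0] != 2:
        return False
    return len(xs) == sym[1] + 1 and all(is_term(x) for x in xs[1:])
```

The recursion in `is_term` only descends to components of `a`, which are strictly smaller than `a`; this is what makes the definition a course-of-values recursion.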

We could continue in this way and define Gödel numberings of various other
syntactical notions, but we are mainly concerned that the reader believes that
it can be done rather than sees all the (gory) details. In particular there
are elementary functions msm and sub with the following properties:

• $\mathrm{msm}(\ulcorner \Gamma \urcorner, \ulcorner \Delta \urcorner)$ codes
  the result of deleting from the multiset of formulas $\Gamma$ all the
  formulas which lie in $\Delta$.

• $\mathrm{sub}(\ulcorner t \urcorner, \ulcorner \vartheta \urcorner) = \ulcorner t\vartheta \urcorner$,
  and $\mathrm{sub}(\ulcorner A \urcorner, \ulcorner \vartheta \urcorner) = \ulcorner A\vartheta \urcorner$,
  where $\vartheta$ is a substitution (i.e. a finite assignment of terms to
  variables) and $t\vartheta$ and $A\vartheta$ are the results of substituting
  those terms for those variables in $t$ or $A$ respectively.


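1.2. Sequents. In our previous exposition of natural deduction one can find
the assumptions free at a given node by inspecting the upper part of the proof
tree. An alternative is to write the free assumptions next to each node, in
the form of a multiset.

By a sequent $\Gamma \Rightarrow A$ we mean a pair consisting of a multiset
$\Gamma = \{\{A_1, \dots, A_n\}\}$ of formulas and a single formula $A$. We
define $\vdash_m \Gamma \Rightarrow A$ inductively by the following rules. An
assumption can be introduced by

\[ \Gamma \Rightarrow A \quad \text{if } A \text{ in } \Gamma. \]

For conjunction $\wedge$ we have an introduction rule $\wedge I$ and two
elimination rules $\wedge E_l$ and $\wedge E_r$.

\[ \frac{\Gamma \Rightarrow A \qquad \Delta \Rightarrow B}{\Gamma, \Delta \Rightarrow A \wedge B}\ \wedge I
\qquad
\frac{\Gamma \Rightarrow A \wedge B}{\Gamma \Rightarrow A}\ \wedge E_l
\qquad
\frac{\Gamma \Rightarrow A \wedge B}{\Gamma \Rightarrow B}\ \wedge E_r \]

Here $\Gamma, \Delta$ denotes multiset union. For implication $\to$ we have an
introduction rule $\to I$ (not mentioning an assumption variable $u$) and an
elimination rule $\to E$.

\[ \frac{\Gamma \Rightarrow B}{\Delta \Rightarrow A \to B}\ \to I
\qquad
\frac{\Gamma \Rightarrow A \to B \qquad \Delta \Rightarrow A}{\Gamma, \Delta \Rightarrow B}\ \to E \]

In $\to I$ the multiset $\Delta$ is obtained from $\Gamma$ by cancelling some
occurrences of $A$. For the universal quantifier $\forall$ we have an
introduction rule $\forall I$ and an elimination rule $\forall E$ (formulated
without the term $t$ to be substituted as additional premise)

\[ \frac{\Gamma \Rightarrow A}{\Gamma \Rightarrow \forall x\, A}\ \forall I
\qquad
\frac{\Gamma \Rightarrow \forall x\, A}{\Gamma \Rightarrow A[x := t]}\ \forall E \]

In $\forall I$ the variable condition needs to hold: for all $B$ in $\Gamma$
we must have $x \notin \mathrm{FV}(B)$.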

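LEMMA. (a) If $\vdash_m \{\{A_1, \dots, A_n\}\} \Rightarrow A$, then for all
(not necessarily distinct) $u_1, \dots, u_n$ such that
$u_i = u_j \to A_i = A_j$ we can find a derivation term
$M^A[u_1^{A_1}, \dots, u_n^{A_n}]$.
(b) For every derivation term $M^A[u_1^{A_1}, \dots, u_n^{A_n}]$ one can find
multiplicities $k_1, \dots, k_n \ge 0$ such that
$\vdash_m \{\{A_1^{k_1}, \dots, A_n^{k_n}\}\} \Rightarrow A$; here $A^k$ means
a $k$-fold occurrence of $A$.

PROOF. (a). Assume $\vdash_m \{\{A_1, \dots, A_n\}\} \Rightarrow A$. We use
induction on $\vdash_m$.

Case assumption. Let $A = A_i$. Take $M = u_i$.

Case $\to I$. We may assume

\[ \frac{\{\{A_1, \dots, A_n, A, \dots, A\}\} \Rightarrow B}{\{\{A_1, \dots, A_n\}\} \Rightarrow A \to B}\ \to I. \]

Let $u_1, \dots, u_n$ be given such that $u_i = u_j \to A_i = A_j$. Pick a new
assumption variable $u$. By IH there exists an $M^B$ such that
$\mathrm{FA}(M^B) \subseteq \{u_1^{A_1}, \dots, u_n^{A_n}, u^A\}$. Then
$(\lambda u^A M^B)^{A \to B}$ is a derivation term with free assumptions among
$u_1^{A_1}, \dots, u_n^{A_n}$.

Case $\to E$. Assume we have derivations of

\[ \{\{A_1, \dots, A_n\}\} \Rightarrow A \to B \quad\text{and}\quad \{\{A_{n+1}, \dots, A_{n+m}\}\} \Rightarrow A. \]

Let $u_1, \dots, u_{n+m}$ be given such that $u_i = u_j \to A_i = A_j$. By IH
we have derivation terms

\[ M^{A \to B}[u_1^{A_1}, \dots, u_n^{A_n}] \quad\text{and}\quad N^A[u_{n+1}^{A_{n+1}}, \dots, u_{n+m}^{A_{n+m}}]. \]

But then also

\[ (MN)^B[u_1^{A_1}, \dots, u_{n+m}^{A_{n+m}}] \]

is a derivation term.

The other cases are treated similarly.

(b). Let a derivation term $M^A[u_1^{A_1}, \dots, u_n^{A_n}]$ be given. We use
induction on $M$.

Case $u^A$. Then $\vdash_m \{\{A\}\} \Rightarrow A$.

Case $\to I$, so $(\lambda u^A M^B)^{A \to B}$. Let
$\mathrm{FA}(M^B) \subseteq \{u_1^{A_1}, \dots, u_n^{A_n}, u^A\}$ with
$u_1, \dots, u_n, u$ distinct. By IH we have

\[ \vdash_m \{\{A_1^{k_1}, \dots, A_n^{k_n}, A^k\}\} \Rightarrow B. \]

Using the rule $\to I$ we obtain

\[ \vdash_m \{\{A_1^{k_1}, \dots, A_n^{k_n}\}\} \Rightarrow A \to B. \]

Case $\to E$. We are given $(M^{A \to B} N^A)^B[u_1^{A_1}, \dots, u_n^{A_n}]$.
By IH we have

\[ \vdash_m \{\{A_1^{k_1}, \dots, A_n^{k_n}\}\} \Rightarrow A \to B \quad\text{and}\quad \vdash_m \{\{A_1^{l_1}, \dots, A_n^{l_n}\}\} \Rightarrow A. \]

Using the rule $\to E$ we obtain

\[ \vdash_m \{\{A_1^{k_1 + l_1}, \dots, A_n^{k_n + l_n}\}\} \Rightarrow B. \]

The other cases are treated similarly.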


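1.3. Coding Derivations. We can now define the set of Gödel numbers of formal
proofs (in the above formulation of minimal logic), as follows.

\[ \begin{aligned}
\mathrm{Deriv}(d) :\iff{} & \forall i{<}\mathrm{lh}(d). \\
& \bigl(\forall m{<}\mathrm{lh}((d)_{i,0})\, \mathrm{For}((d)_{i,0,m}) \wedge \exists n{<}\mathrm{lh}((d)_{i,0})\, ((d)_{i,1} = (d)_{i,0,n})\bigr) && (A) \\
\vee{} & \bigl(\exists j,k{<}i.\ (d)_{i,1} = \langle \mathrm{SN}(\wedge), (d)_{j,1}, (d)_{k,1} \rangle \wedge (d)_{i,0} =_m (d)_{j,0} * (d)_{k,0}\bigr) && (\wedge I) \\
\vee{} & \bigl(\exists j{<}i.\ (d)_{j,1} = \langle \mathrm{SN}(\wedge), (d)_{i,1}, (d)_{j,1,2} \rangle \wedge (d)_{i,0} =_m (d)_{j,0}\bigr) && (\wedge E_l) \\
\vee{} & \bigl(\exists j{<}i.\ (d)_{j,1} = \langle \mathrm{SN}(\wedge), (d)_{j,1,1}, (d)_{i,1} \rangle \wedge (d)_{i,0} =_m (d)_{j,0}\bigr) && (\wedge E_r) \\
\vee{} & \bigl(\exists j{<}i.\ (d)_{i,1} = \langle \mathrm{SN}(\to), (d)_{i,1,1}, (d)_{j,1} \rangle \wedge \mathrm{For}((d)_{i,1,1}) && (\to I) \\
& \quad \wedge\ \mathrm{msm}((d)_{i,0}, (d)_{j,0}) = 0 \\
& \quad \wedge\ \forall n{<}\mathrm{lh}(\mathrm{msm}((d)_{j,0}, (d)_{i,0}))\, \bigl((\mathrm{msm}((d)_{j,0}, (d)_{i,0}))_n = (d)_{i,1,1}\bigr)\bigr) \\
\vee{} & \bigl(\exists j,k{<}i.\ (d)_{j,1} = \langle \mathrm{SN}(\to), (d)_{k,1}, (d)_{i,1} \rangle \wedge (d)_{i,0} =_m (d)_{j,0} * (d)_{k,0}\bigr) && (\to E) \\
\vee{} & \bigl(\exists j{<}i.\ (d)_{i,1} = \langle \mathrm{SN}(\forall), (d)_{i,1,1}, (d)_{j,1} \rangle \wedge \mathrm{Var}((d)_{i,1,1}) && (\forall I) \\
& \quad \wedge\ (d)_{i,0} =_m (d)_{j,0} \wedge \forall n{<}\mathrm{lh}((d)_{i,0})\, \neg \mathrm{FV}((d)_{i,1,1}, (d)_{i,0,n})\bigr) \\
\vee{} & \bigl(\exists j{<}i.\ (d)_{j,1,0} = \mathrm{SN}(\forall) \wedge (d)_{i,0} =_m (d)_{j,0} && (\forall E) \\
& \quad \wedge\ \bigl((d)_{i,1} = (d)_{j,1,2} \vee \exists n{<}(d)_{i,1}.\ \mathrm{Ter}(n) \\
& \qquad \wedge\ (d)_{i,1} = \mathrm{sub}((d)_{j,1,2}, \langle \langle (d)_{j,1,1}, n \rangle \rangle)\bigr)\bigr).
\end{aligned} \]

Note that (1) this clearly defines an elementary set, and (2) if one carefully
reads the definition, then it becomes clear that $d$ is in Deriv iff $d$ codes
a sequence of pairs (sequents) $\Gamma_i \Rightarrow A_i$ with $\Gamma_i$ a
multiset, such that this sequence constitutes a derivation in minimal logic,
i.e. each sequent is either an axiom or else follows from previous sequents by
a rule. Thus


LEMMA.
(a) $\mathrm{Deriv}(d)$ if and only if $d$ is the Gödel number of a derivation.
(b) Deriv is elementary.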


1.4. Axiomatizable Theories. A set $ of formulas is called recursive 
(elementary, ©$-definable), if "S71 := {"A7| A € S} is recursive (elemen- 
tary, ©}-definable). Clearly the sets Stabax¢ of stability axioms and Eq; of 
£-equality axioms are elementary. 

Now let £ be an elementarily presented language with = in £. A theory 
T with L(T) C CL is called recursively (elementarily) aziomatizable, if there 
is a recursive (elementary) set S of closed £-formulas such that T = { A € 
L\SUEge ke A}. 

THEOREM. For theories T with $L(T) \subseteq \mathcal{L}$ the following are equivalent. 


(a) T is recursively axiomatizable. 
(b) T is elementarily axiomatizable. 
(c) T is $\Sigma^0_1$-definable. 


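PROOF. (c) $\Rightarrow$ (b). Let $\ulcorner T\urcorner$ be $\Sigma^0_1$-definable. Then by Section 2.5 
of Chapter 3 there exists an $f \in \mathcal{E}$ such that $\ulcorner T\urcorner = \mathrm{ran}(f)$, and by the 
argument there we can assume $f(n) < n$ for all n. Let $f(n) = \ulcorner A_n\urcorner$. We 
define an elementary function g with the property $g(n) = \ulcorner A_0 \land \dots \land A_n\urcorner$ 
by 

\[
\begin{aligned}
g(0) &= f(0), \\
g(n+1) &= g(n) \mathbin{\dot\land} f(n+1),
\end{aligned}
\]

where $a \mathbin{\dot\land} b := \langle \mathrm{SN}(\land), a, b\rangle$. Clearly g can be bounded in $\mathcal{E}$. For 
$S := \{A_0 \land \dots \land A_n \mid n \in \mathbb{N}\}$ we have $\ulcorner S\urcorner = \mathrm{ran}(g)$, and this set is elementary 
because of $a \in \mathrm{ran}(g) \leftrightarrow \exists n{<}a\,(a = g(n))$. T is elementarily axiomatizable, 
since $T = \{A \in \mathcal{L} \mid S \cup \mathrm{Eq}_{\mathcal{L}} \vdash_c A\}$. 
(b) $\Rightarrow$ (a) is clear. 
(a) $\Rightarrow$ (c). Let T be axiomatized by S with $\ulcorner S\urcorner$ recursive. Then 

\[
a \in \ulcorner T\urcorner \leftrightarrow \exists d\,\exists c{<}d.\ \mathrm{Deriv}(d) \land (d)_{\mathrm{lh}(d)\mathbin{\dot-}1} = \langle c, a\rangle \land
\forall i{<}\mathrm{lh}(c)\,\bigl((c)_i \in \ulcorner\mathrm{Stabax}\urcorner \cup \ulcorner\mathrm{Eq}\urcorner \cup \ulcorner S\urcorner\bigr).
\]

Hence $\ulcorner T\urcorner$ is $\Sigma^0_1$-definable. 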
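
The step (c) $\Rightarrow$ (b) replaces an enumerated axiom set by the growing conjunctions $A_0 \land \dots \land A_n$, whose range can be decided by a bounded search. A minimal Python sketch of that idea follows, assuming formulas are non-empty strings and using string length in place of the size of a Gödel number; `conj_axioms`, `in_range` and the toy enumeration `f` are hypothetical names.

```python
def conj_axioms(f):
    """Return g with g(n) = the conjunction A_0 & ... & A_n, where A_i = f(i)."""
    def g(n):
        formula = f(0)
        for i in range(1, n + 1):
            formula = "(" + formula + " & " + f(i) + ")"
        return formula
    return g


def in_range(g, a):
    """Decide whether a is in ran(g).

    Every g(n) has more than n characters (formulas are assumed non-empty),
    so a = g(n) forces n < len(a) and the search is bounded.
    """
    return any(g(n) == a for n in range(len(a)))


# Toy enumeration with repetitions, standing in for f(n) = code of A_n
f = lambda n: "P%d" % (n % 3)
g = conj_axioms(f)
```

The bound `len(a)` plays the role of the bound n < a in the proof: it is what turns membership in an enumerated set into a terminating test.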
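A theory T in our elementarily presented language $\mathcal{L}$ is called axiomatized, 
if it is given by a $\Sigma^0_1$-definable axiom system $\mathrm{Ax}_T$. By the theorem just 
proved we can even assume that $\mathrm{Ax}_T$ is elementary. For such axiomatized 
theories we define $\mathrm{Prf}_T \subseteq \mathbb{N} \times \mathbb{N}$ by 

\[
\mathrm{Prf}_T(d, a) :\leftrightarrow \mathrm{Deriv}(d) \land \exists c{<}d.\ (d)_{\mathrm{lh}(d)\mathbin{\dot-}1} = \langle c, a\rangle \land
\forall i{<}\mathrm{lh}(c)\,\bigl((c)_i \in \ulcorner\mathrm{Stabax}\urcorner \cup \ulcorner\mathrm{Eq}\urcorner \cup \ulcorner\mathrm{Ax}_T\urcorner\bigr).
\]

Clearly $\mathrm{Prf}_T$ is elementary and $\mathrm{Prf}_T(d, a)$ if and only if d is a derivation of 
a sequent $\Gamma \Rightarrow A$ with $\Gamma$ composed from stability axioms, equality axioms 
and formulas from $\mathrm{Ax}_T$, and $a = \ulcorner A\urcorner$. 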

A theory T is called consistent, if there is a closed formula A such that 
$A \notin T$; otherwise T is called inconsistent. 


COROLLARY. Every axiomatized complete theory T is recursive. 


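PROOF. If T is inconsistent, then $\ulcorner T\urcorner$ is recursive. If not, then from the 
completeness of T we obtain 

\[
a \in \mathbb{N} \setminus \ulcorner T\urcorner \leftrightarrow a \notin \mathrm{For} \lor \exists b{<}a\,\mathrm{FV}(b, a) \lor \dot\neg a \in \ulcorner T\urcorner,
\]

where $\dot\neg a := a \mathbin{\dot\to} \ulcorner\bot\urcorner$ and $a \mathbin{\dot\to} b := \langle \mathrm{SN}(\to), a, b\rangle$. Hence with $\ulcorner T\urcorner$ also 
$\mathbb{N} \setminus \ulcorner T\urcorner$ is $\Sigma^0_1$-definable and therefore $\ulcorner T\urcorner$ is recursive. 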
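
Read as an algorithm, the corollary says that for a closed formula one may search through all proof codes until a proof of the formula or of its negation turns up; completeness guarantees termination and consistency guarantees that the answer is unambiguous. The Python sketch below only illustrates this search. The proof predicate `toy_prf` and the negation `toy_neg` are hypothetical stand-ins for $\mathrm{Prf}_T$ and $\dot\neg$ on a toy "theory" of parity statements, not part of the text.

```python
from itertools import count


def decide(sentence, prf, neg):
    """Decide membership in a complete consistent theory by proof search."""
    for d in count():
        if prf(d, sentence):
            return True       # d codes a proof of the sentence
        if prf(d, neg(sentence)):
            return False      # d codes a proof of its negation


def toy_prf(d, s):
    """Toy proof predicate: the number k itself 'proves' the true parity claim about k."""
    if s[0] == "even":
        return d == s[1] and s[1] % 2 == 0
    if s[0] == "not" and s[1][0] == "even":
        return d == s[1][1] and s[1][1] % 2 == 1
    return False


def toy_neg(s):
    """Toy negation that cancels double negations (a simplification)."""
    return s[1] if s[0] == "not" else ("not", s)
```

For example, `decide(("even", 4), toy_prf, toy_neg)` stops at d = 4 with a proof of the sentence, while `decide(("even", 7), toy_prf, toy_neg)` stops at d = 7 with a proof of the negation.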


2. Undefinability of the Notion of Truth 


Recall the convention in 1.2 of Chapter 1: once a formula has been 
introduced as A(x), i.e., A with a designated variable x, we write A(t) for 
$A[x := t]$, and similarly with more variables. 


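2.1. Definable Relations. Let M be an $\mathcal{L}$-structure. A relation 
$R \subseteq |\mathcal{M}|^n$ is called definable in M if there is an $\mathcal{L}$-formula $A(x_1,\dots,x_n)$ with 
only the free variables shown such that 

\[ R = \{ (a_1,\dots,a_n) \in |\mathcal{M}|^n \mid \mathcal{M} \models A[a_1,\dots,a_n] \}. \]

We assume in this section that $|\mathcal{M}| = \mathbb{N}$, 0 is a constant in $\mathcal{L}$ and S is a 
unary function symbol in $\mathcal{L}$ with $0^{\mathcal{M}} = 0$ and $S^{\mathcal{M}}(a) = a+1$. Then for every 
$a \in \mathbb{N}$ we can define the numeral $\underline{a} \in \mathrm{Ter}_{\mathcal{L}}$ by $\underline{0} := 0$ and $\underline{a+1} := S(\underline{a})$. 
Observe that in this case the definability of $R \subseteq \mathbb{N}^n$ by $A(x_1,\dots,x_n)$ is 
equivalent to 

\[ R = \{ (a_1,\dots,a_n) \in \mathbb{N}^n \mid \mathcal{M} \models A(\underline{a_1},\dots,\underline{a_n}) \}. \]

Furthermore let $\mathcal{L}$ be an elementarily presented language. We shall always 
assume in this section that every elementary relation is definable in M. A 
set S of formulas is called definable in M, if $\ulcorner S\urcorner := \{\ulcorner A\urcorner \mid A \in S\}$ is 
definable in M. 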

We shall show that already from these assumptions it follows that the 
notion of truth for M, more precisely the set Th(M) of all closed formulas 
valid in M, is undefinable in M. From this it will follow in turn that the 
notion of truth is in fact undecidable, for otherwise the set Th(M) would 
be recursive (by Church’s Thesis), hence $\Sigma^0_1$-definable, and hence definable, 
because we have assumed already that all elementary relations are definable 
in M. 


2.2. Fixed Points. For the proof we shall need the following lemma, 
which will be generalized in the next section. 


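LEMMA (Semantical Fixed Point Lemma). If every elementary relation 
is definable in M, then for every $\mathcal{L}$-formula B(z) with only z free we can 
find a closed $\mathcal{L}$-formula A such that 

\[ \mathcal{M} \models A \quad\text{if and only if}\quad \mathcal{M} \models B[\ulcorner A\urcorner]. \]

PROOF. We define an elementary function s by 

\[ s(b, k) := \mathrm{sub}(b, \langle\langle \ulcorner z\urcorner, \ulcorner\underline{k}\urcorner\rangle\rangle). \]

Here z is a specially given variable determined by B(z), say $*_0$. Then for 
every formula C(z) we have 

\[ s(\ulcorner C\urcorner, k) = \mathrm{sub}(\ulcorner C\urcorner, \langle\langle \ulcorner z\urcorner, \ulcorner\underline{k}\urcorner\rangle\rangle) = \ulcorner C(\underline{k})\urcorner, \]

hence in particular 

\[ s(\ulcorner C\urcorner, \ulcorner C\urcorner) = \ulcorner C(\ulcorner C\urcorner)\urcorner. \]

By assumption the graph $G_s$ of s is definable in M, by $A_s(x_1, x_2, x_3)$ say. 
Let 

\[
\begin{aligned}
C(z) &:= \exists x.\,B(x) \land A_s(z, z, x), \\
A &:= C(\ulcorner C\urcorner),
\end{aligned}
\]

so 

\[ A = \exists x.\,B(x) \land A_s(\ulcorner C\urcorner, \ulcorner C\urcorner, x). \]

Hence $\mathcal{M} \models A$ if and only if $\exists a{\in}\mathbb{N}.\ \mathcal{M} \models B[a]$ and $a = \ulcorner C(\ulcorner C\urcorner)\urcorner$, so if and 
only if $\mathcal{M} \models B[\ulcorner A\urcorner]$. 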
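
The diagonal construction in this proof can be tried out concretely. In the Python sketch below, source strings play the role of Gödel numbers, `repr` plays the role of the numeral, evaluation with `eval` plays the role of truth in M, and `s` substitutes a quoted string for the placeholder `Z`. All names (`s`, `fixed_point`, `is_short`, `Z`) are assumptions made for illustration; the predicate name must not itself contain `Z`.

```python
def s(b, k):
    """Substitute the quotation of k for the placeholder Z in the code b."""
    return b.replace("Z", repr(k))


def fixed_point(b_name):
    """Return a closed expression A whose value equals b_name applied to A's own source."""
    c = f"{b_name}(s(Z, Z))"   # C(z): 'the diagonalization of z has property B'
    return s(c, c)             # A := C applied to its own code


def is_short(code):
    """Example property B of codes."""
    return len(code) < 20


A = fixed_point("is_short")
env = {"s": s, "is_short": is_short}
```

Evaluating `A` in `env` first recomputes the string `A` itself via `s` and then applies `is_short` to it, so `eval(A, env)` and `is_short(A)` agree, which is the analogue of $\mathcal{M} \models A$ if and only if $\mathcal{M} \models B[\ulcorner A\urcorner]$.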


2.3. Undefinability. We can now prove the undefinability of truth. 


THEOREM (Tarski’s Undefinability Theorem). Assume that every 
elementary relation is definable in M. Then Th(M) is undefinable in M, 
hence in particular not $\Sigma^0_1$-definable. 


PROOF. Assume that $\ulcorner\mathrm{Th}(\mathcal{M})\urcorner$ is definable by $B_W(z)$. Then for all 
closed formulas A 

\[ \mathcal{M} \models A \quad\text{if and only if}\quad \mathcal{M} \models B_W[\ulcorner A\urcorner]. \]

Now consider the formula $\neg B_W(z)$ and choose by the Fixed Point Lemma 
a closed $\mathcal{L}$-formula A such that 

\[ \mathcal{M} \models A \quad\text{if and only if}\quad \mathcal{M} \models \neg B_W[\ulcorner A\urcorner]. \]

This contradicts the equivalence above. 
We already have noticed that all $\Sigma^0_1$-definable relations are definable in 
M. Hence it follows that $\ulcorner\mathrm{Th}(\mathcal{M})\urcorner$ cannot be $\Sigma^0_1$-definable. 


3. The Notion of Truth in Formal Theories 


We now want to generalize the arguments of the previous section. There 
we have made essential use of the notion of truth in a structure M, i.e. of 
the relation $\mathcal{M} \models A$. The set of all closed formulas A such that $\mathcal{M} \models A$ has 
been called the theory of M, denoted Th(M). 

Now instead of Th(M) we shall start more generally from an arbitrary 
theory T. We shall deal with the question as to whether in T there is a notion 
of truth (in the form of a truth formula B(z)), such that B(z) “means” that 
z is “true”. 

What shall this mean? We have to explain all the notions used without 
referring to semantical concepts at all. 


• z ranges over closed formulas (or sentences) A, or more precisely 
  over their Gödel numbers $\ulcorner A\urcorner$. 

• A “true” is to be replaced by $T \vdash A$. 

• C “equivalent” to D is to be replaced by $T \vdash C \leftrightarrow D$. 


We want to study the question as to whether it is possible that a truth 
formula B(z) exists, such that for all sentences A we have $T \vdash A \leftrightarrow B(\ulcorner A\urcorner)$. 
The result will be that this is impossible, under rather weak assumptions on 
the theory T. 


3.1. Representable Relations. Technically, the issue will be to re- 
place the notion of definability by the notion of “representability” within a 
formal theory. 

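Let $\mathcal{L}$ again be an elementarily presented language with 0, S, = in $\mathcal{L}$ and 
T be a theory containing the equality axioms $\mathrm{Eq}_{\mathcal{L}}$. 


DEFINITION. A relation $R \subseteq \mathbb{N}^n$ is representable in T if there is a formula 
$A(x_1,\dots,x_n)$ such that 

\[
\begin{aligned}
& T \vdash A(\underline{a_1},\dots,\underline{a_n}), && \text{if } (a_1,\dots,a_n) \in R, \\
& T \vdash \neg A(\underline{a_1},\dots,\underline{a_n}), && \text{if } (a_1,\dots,a_n) \notin R.
\end{aligned}
\]

A function $f\colon \mathbb{N}^n \to \mathbb{N}$ is called representable in T if there is a formula 
$A(x_1,\dots,x_n,y)$ representing the graph $G_f \subseteq \mathbb{N}^{n+1}$ of f, i.e., such that 

\[ T \vdash A(\underline{a_1},\dots,\underline{a_n},\underline{f(a_1,\dots,a_n)}), \tag{16} \]

\[ T \vdash \neg A(\underline{a_1},\dots,\underline{a_n},\underline{c}), \quad\text{if } c \ne f(a_1,\dots,a_n) \tag{17} \]

and such that in addition 

\[ T \vdash A(\underline{a_1},\dots,\underline{a_n},y) \land A(\underline{a_1},\dots,\underline{a_n},z) \to y = z \quad\text{for all } a_1,\dots,a_n \in \mathbb{N}. \tag{18} \]

Notice that in case $T \vdash \underline{b} \ne \underline{c}$ for $b < c$ the condition (17) follows from 
(16) and (18). 

LEMMA. If the characteristic function $c_R$ of a relation $R \subseteq \mathbb{N}^n$ is 
representable in T, then so is the relation R itself. 


PROOF. For simplicity assume n = 1. Let A(x, y) be a formula 
representing $c_R$. We show that $A(x, \underline{1})$ represents the relation R. So assume 
$a \in R$. Then $c_R(a) = 1$, hence $(a, 1) \in G_{c_R}$, hence $T \vdash A(\underline{a}, \underline{1})$. Conversely, 
assume $a \notin R$. Then $c_R(a) = 0$, hence $(a, 1) \notin G_{c_R}$, hence $T \vdash \neg A(\underline{a}, \underline{1})$. 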




3.2. Fixed Points. We can now prove a generalized (syntactical) ver- 
sion of the Fixed Point Lemma above. 


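LEMMA (Fixed Point Lemma). Assume that all elementary functions 
are representable in T. Then for every formula B(z) with only z free we can 
find a closed formula A such that 

\[ T \vdash A \leftrightarrow B(\ulcorner A\urcorner). \]

PROOF. We start as in the proof of the Semantical Fixed Point Lemma. 
Let $A_s(x_1, x_2, x_3)$ be a formula which represents the elementary function 
$s(b, k) := \mathrm{sub}(b, \langle\langle \ulcorner z\urcorner, \ulcorner\underline{k}\urcorner\rangle\rangle)$. Let 

\[
\begin{aligned}
C(z) &:= \exists x.\,B(x) \land A_s(z, z, x), \\
A &:= C(\ulcorner C\urcorner),
\end{aligned}
\]

so 

\[ A = \exists x.\,B(x) \land A_s(\ulcorner C\urcorner, \ulcorner C\urcorner, x). \]

Because of $s(\ulcorner C\urcorner, \ulcorner C\urcorner) = \ulcorner C(\ulcorner C\urcorner)\urcorner = \ulcorner A\urcorner$ we can prove in T 

\[ A_s(\ulcorner C\urcorner, \ulcorner C\urcorner, x) \leftrightarrow x = \ulcorner A\urcorner, \]

hence by definition of A also 

\[ A \leftrightarrow \exists x.\,B(x) \land x = \ulcorner A\urcorner \]

and hence 

\[ A \leftrightarrow B(\ulcorner A\urcorner). \]

Notice that for T = Th(M) we obtain the above (semantical) Fixed 
Point Lemma as a special case. 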


3.3. Undefinability. Using the Fixed Point Lemma above, we can 
generalize the undefinability result as well. 


THEOREM (Undefinability of the Notion of Truth). Let T be a consistent 
theory such that all elementary functions are representable in T. Then there 
cannot exist a formula B(z) with only z free defining the notion of truth, 
i.e. such that for all closed formulas A 


\[ T \vdash A \leftrightarrow B(\ulcorner A\urcorner). \]


PROOF. Assume we would have such a B(z). Consider the formula 
$\neg B(z)$ and choose by the Fixed Point Lemma a closed formula A such that 

\[ T \vdash A \leftrightarrow \neg B(\ulcorner A\urcorner). \]

For this A we have $T \vdash A \leftrightarrow \neg A$, contradicting the consistency of T. 


For T = Th(M) Tarski’s Undefinability Theorem is a special case. 


4. Undecidability and Incompleteness 


In this section we consider a consistent formal theory T with the property 
that all recursive functions are representable in T. This is a very weak 
assumption, as we shall show in the next section: it is always satisfied if the 
theory allows one to develop a certain minimum of arithmetic. 

We shall show that such a theory necessarily is undecidable. Moreover 
we shall prove Gödel’s First Incompleteness Theorem, which says that every 
axiomatized such theory must be incomplete. We will also prove a sharpened 
form of this theorem due to Rosser, which explicitly provides a closed 
formula A such that neither A nor $\neg A$ is provable in the theory T. 


4.1. Undecidability; Gödel’s First Incompleteness Theorem. 
Let again $\mathcal{L}$ be an elementarily presented language with 0, S, = in $\mathcal{L}$, and T 
be a theory containing the equality axioms $\mathrm{Eq}_{\mathcal{L}}$. 


THEOREM. Assume that T is a consistent theory such that all recursive 
functions are representable in T. Then T is not recursive. 


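PROOF. Assume that T is recursive. By assumption there exists a formula 
B(z) representing $\ulcorner T\urcorner$ in T. Choose by the Fixed Point Lemma in 
3.2 a closed formula A such that 

\[ T \vdash A \leftrightarrow \neg B(\ulcorner A\urcorner). \]

We shall prove (*) $T \nvdash A$ and (**) $T \vdash A$; this is the desired contradiction. 
Ad (*). Assume $T \vdash A$. Then $A \in T$, hence $\ulcorner A\urcorner \in \ulcorner T\urcorner$, hence 
$T \vdash B(\ulcorner A\urcorner)$ (because B(z) represents in T the set $\ulcorner T\urcorner$). By the choice of A it 
follows that $T \vdash \neg A$, which contradicts the consistency of T. 
Ad (**). By (*) we know $T \nvdash A$. Therefore $A \notin T$, hence $\ulcorner A\urcorner \notin \ulcorner T\urcorner$ 
and hence $T \vdash \neg B(\ulcorner A\urcorner)$. By the choice of A it follows that $T \vdash A$. 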


THEOREM (Gödel’s First Incompleteness Theorem). Assume that T is 
an axiomatized consistent theory with the property that all recursive 
functions are representable in T. Then T is incomplete. 


PROOF. This is an immediate consequence of the above theorem and 
the corollary in 1.4. 


4.2. Rosser’s Form of Gödel’s First Incompleteness Theorem. 
As already mentioned, we now want to sharpen the Incompleteness Theorem, 
by producing a formula A such that neither A nor $\neg A$ is provable. The 
original idea is due to Rosser. 

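THEOREM (Gödel-Rosser). Let T be an axiomatized consistent $\mathcal{L}$-theory 
with 0, S, = in $\mathcal{L}$ and $\mathrm{Eq}_{\mathcal{L}} \subseteq T$. Assume that there is a formula L(x, y), 
written $x < y$, such that 

\[ T \vdash \forall x.\,x < \underline{a} \to x = \underline{0} \lor \dots \lor x = \underline{a-1}, \tag{19} \]

\[ T \vdash \forall x.\,x = \underline{0} \lor \dots \lor x = \underline{a} \lor \underline{a} < x. \tag{20} \]

Moreover assume that every elementary function is representable in T. Then 
we can find a closed formula A such that neither A nor $\neg A$ is provable in 
T. 

PROOF. We first define $\mathrm{Refut}_T \subseteq \mathbb{N} \times \mathbb{N}$ by 

\[ \mathrm{Refut}_T(d, a) :\leftrightarrow \mathrm{Prf}_T(d, \dot\neg a). \]

So $\mathrm{Refut}_T$ is elementary, and we have $\mathrm{Refut}_T(d, a)$ if and only if d is a 
refutation of a in T, i.e., d is a derivation of a sequent $\Gamma \Rightarrow \neg A$ coded by 
$\dot\neg a = \ulcorner\neg A\urcorner$ and $\Gamma$ is composed from stability axioms and formulas from 
$\mathrm{Ax}_T$. Let $B_{\mathrm{Prf}_T}(x_1, x_2)$ and $B_{\mathrm{Refut}_T}(x_1, x_2)$ be formulas representing $\mathrm{Prf}_T$ 
and $\mathrm{Refut}_T$, respectively. Choose by the Fixed Point Lemma in 3.2 a closed 
formula A such that 

\[ T \vdash A \leftrightarrow \forall x.\,B_{\mathrm{Prf}_T}(x, \ulcorner A\urcorner) \to \exists y.\,y < x \land B_{\mathrm{Refut}_T}(y, \ulcorner A\urcorner). \]

So A expresses its own underivability, in the form (due to Rosser) “For every 
proof of me there is a shorter proof of my negation”. 

We shall show (*) $T \nvdash A$ and (**) $T \nvdash \neg A$. Ad (*). Assume $T \vdash A$. 
Choose a such that 

\[ \mathrm{Prf}_T(a, \ulcorner A\urcorner). \]

Then we also have 

\[ \text{not } \mathrm{Refut}_T(b, \ulcorner A\urcorner) \quad\text{for all } b, \]

since T is consistent. Hence we have 

\[
\begin{aligned}
& T \vdash B_{\mathrm{Prf}_T}(\underline{a}, \ulcorner A\urcorner), \\
& T \vdash \neg B_{\mathrm{Refut}_T}(\underline{b}, \ulcorner A\urcorner) \quad\text{for all } b.
\end{aligned}
\]

By (19) we can conclude 

\[ T \vdash B_{\mathrm{Prf}_T}(\underline{a}, \ulcorner A\urcorner) \land \forall y.\,y < \underline{a} \to \neg B_{\mathrm{Refut}_T}(y, \ulcorner A\urcorner). \]

Hence we have 

\[
\begin{aligned}
& T \vdash \exists x.\,B_{\mathrm{Prf}_T}(x, \ulcorner A\urcorner) \land \forall y.\,y < x \to \neg B_{\mathrm{Refut}_T}(y, \ulcorner A\urcorner), \\
& T \vdash \neg A.
\end{aligned}
\]

This contradicts the assumed consistency of T. 
Ad (**). Assume $T \vdash \neg A$. Choose a such that 

\[ \mathrm{Refut}_T(a, \ulcorner A\urcorner). \]

Then we also have 

\[ \text{not } \mathrm{Prf}_T(b, \ulcorner A\urcorner) \quad\text{for all } b, \]

since T is consistent. Hence we have 

\[
\begin{aligned}
& T \vdash B_{\mathrm{Refut}_T}(\underline{a}, \ulcorner A\urcorner), \\
& T \vdash \neg B_{\mathrm{Prf}_T}(\underline{b}, \ulcorner A\urcorner) \quad\text{for all } b.
\end{aligned}
\]

But this implies 

\[ T \vdash \forall x.\,B_{\mathrm{Prf}_T}(x, \ulcorner A\urcorner) \to \exists y.\,y < x \land B_{\mathrm{Refut}_T}(y, \ulcorner A\urcorner), \]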


as can be seen easily by cases on x, using (20). Hence $T \vdash A$. But this again 
contradicts the assumed consistency of T. 


4.3. Relativized Gödel-Rosser Theorem. Finally we formulate a 
variant of this theorem which does not assume any more that the theory T 
talks about numbers only. 

THEOREM (Gödel-Rosser). Assume that T is an axiomatized consistent 
$\mathcal{L}$-theory with 0, S, = in $\mathcal{L}$ and $\mathrm{Eq}_{\mathcal{L}} \subseteq T$. Furthermore assume that there are 
formulas N(x) and L(x, y), written Nx and $x < y$, such that $T \vdash N0$, 
$T \vdash \forall x{\in}N\ N(S(x))$ and 

\[
\begin{aligned}
& T \vdash \forall x{\in}N.\,x < \underline{a} \to x = \underline{0} \lor \dots \lor x = \underline{a-1}, \\
& T \vdash \forall x{\in}N.\,x = \underline{0} \lor \dots \lor x = \underline{a} \lor \underline{a} < x.
\end{aligned}
\]

Here $\forall x{\in}N\ A$ is short for $\forall x.\,Nx \to A$. Moreover assume that every 
elementary function is representable in T. Then one can find a closed formula 
A such that neither A nor $\neg A$ is provable in T. 


PROOF. As before; just relativize all quantifiers to N. 


5. Representability 


We show in this section that already very simple theories have the prop- 
erty that all recursive functions are representable in them. 


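5.1. A Weak Arithmetic. It is here where the need for Gödel’s 
$\beta$-function arises: Recall that we had used it to prove that the class of recursive 
functions can be generated without use of the primitive recursion scheme, 
i.e. with composition and the unbounded $\mu$-operator as the only generating 
schemata. 


THEOREM. Assume that T is an $\mathcal{L}$-theory with 0, S, = in $\mathcal{L}$ and $\mathrm{Eq}_{\mathcal{L}} \subseteq T$. 
Furthermore assume that there are formulas N(x) and L(x, y), written Nx 
and $x < y$, such that $T \vdash N0$, $T \vdash \forall x{\in}N\ N(S(x))$ and the following hold: 

\[
\begin{aligned}
(21)\quad & T \vdash S(\underline{a}) \ne 0 && \text{for all } a \in \mathbb{N}, \\
(22)\quad & T \vdash S(\underline{a}) = S(\underline{b}) \to \underline{a} = \underline{b} && \text{for all } a, b \in \mathbb{N}, \\
(23)\quad & \text{the functions } + \text{ and } \cdot \text{ are representable in } T, \\
(24)\quad & T \vdash \forall x{\in}N\ (x \not< 0), \\
(25)\quad & T \vdash \forall x{\in}N.\,x < S(\underline{b}) \to x < \underline{b} \lor x = \underline{b} && \text{for all } b \in \mathbb{N}, \\
(26)\quad & T \vdash \forall x{\in}N.\,x < \underline{b} \lor x = \underline{b} \lor \underline{b} < x && \text{for all } b \in \mathbb{N}.
\end{aligned}
\]

Here again $\forall x{\in}N\ A$ is short for $\forall x.\,Nx \to A$. Then T fulfills the assumptions 
of the theorem in 4.3, i.e., the Gödel-Rosser Theorem relativized to N. In 
particular we have, for all $a \in \mathbb{N}$ 

\[
\begin{aligned}
(27)\quad & T \vdash \forall x{\in}N.\,x < \underline{a} \to x = \underline{0} \lor \dots \lor x = \underline{a-1}, \\
(28)\quad & T \vdash \forall x{\in}N.\,x = \underline{0} \lor \dots \lor x = \underline{a} \lor \underline{a} < x,
\end{aligned}
\]

and every recursive function is representable in T. 


PROOF. (27) can be proved easily by induction on a. The base case 
follows from (24), and the step from the induction hypothesis and (25). 
(28) immediately follows from the trichotomy law (26), using (27). 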
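
For the reader's convenience, the induction step for (27) can be written out as follows. This is a sketch that only spells out the two implications named above, for an arbitrary x with Nx, using that $S(\underline{a})$ is the numeral $\underline{a+1}$.

```latex
\begin{align*}
x < S(\underline{a})
  &\to x < \underline{a} \lor x = \underline{a}
  && \text{by (25) with } b := a \\
  &\to (x = \underline{0} \lor \dots \lor x = \underline{a-1}) \lor x = \underline{a}
  && \text{by the induction hypothesis (27) for } a.
\end{align*}
```

The last line is exactly the conclusion of (27) for a + 1.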

For the representability of recursive functions, first note that the 
formulas $x = y$ and $x < y$ actually do represent in T the equality and the 
less-than relations, respectively. From (21) and (22) we can see immediately 
that $T \vdash \underline{a} \ne \underline{b}$ when $a \ne b$. Assume $a \not< b$. We show $T \vdash \underline{a} \not< \underline{b}$ by induction 
on b. $T \vdash \underline{a} \not< 0$ follows from (24). In the step we have $a \not< b+1$, hence 
$a \not< b$ and $a \ne b$, hence by induction hypothesis and the representability 
(above) of the equality relation, $T \vdash \underline{a} \not< \underline{b}$ and $T \vdash \underline{a} \ne \underline{b}$, hence by (25) 
$T \vdash \underline{a} \not< S(\underline{b})$. Now assume $a < b$. Then $T \vdash \underline{a} \ne \underline{b}$ and $T \vdash \underline{b} \not< \underline{a}$, hence by 
(26) $T \vdash \underline{a} < \underline{b}$. 

We now show by induction on the definition of $\mu$-recursive functions, 
that every recursive function is representable in T. Observe first that the 
second condition (17) in the definition of representability of a function 
automatically follows from the other two conditions (and hence need not be 
checked further). This is because if $c \ne f(a_1,\dots,a_n)$ then by contraposing 
the third condition (18), 

\[ T \vdash \underline{c} \ne \underline{f(a_1,\dots,a_n)} \to A(\underline{a_1},\dots,\underline{a_n},\underline{f(a_1,\dots,a_n)}) \to \neg A(\underline{a_1},\dots,\underline{a_n},\underline{c}) \]

and hence by using representability of equality and the first representability 
condition (16) we obtain $T \vdash \neg A(\underline{a_1},\dots,\underline{a_n},\underline{c})$. 

The initial functions constant 0, successor and projection (onto the 
i-th coordinate) are trivially represented by the formulas $0 = y$, $S(x) = y$ 
and $x_i = y$ respectively. Addition and multiplication are represented in 
T by assumption. Recall that the one remaining initial function of 
$\mu$-recursiveness is $\dot-$, but this is definable from the characteristic function 
of < by $a \mathbin{\dot-} b = \mu i.\,b + i \ge a = \mu i.\,c_<(b + i, a) = 0$. We now show that the 
characteristic function of < is representable in T. (It will then follow that 
$\dot-$ is representable, once we have shown that the representable functions are 
closed under $\mu$.) So define 

\[ A(x_1, x_2, y) := (x_1 < x_2 \land y = \underline{1}) \lor (x_1 \not< x_2 \land y = 0). \]

Assume $a_1 < a_2$. Then $T \vdash \underline{a_1} < \underline{a_2}$, hence $T \vdash A(\underline{a_1}, \underline{a_2}, \underline{1})$. Now assume 
$a_1 \not< a_2$. Then $T \vdash \underline{a_1} \not< \underline{a_2}$, hence $T \vdash A(\underline{a_1}, \underline{a_2}, 0)$. Furthermore notice 
that $A(x_1, x_2, y) \land A(x_1, x_2, z) \to y = z$ already follows logically from the 
equality axioms (by cases on $x_1 < x_2$). 
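
The definition of cut-off subtraction from the characteristic function of < by unbounded minimization can be checked numerically. The following Python sketch assumes ordinary machine integers for the natural numbers; `mu`, `c_less` and `monus` are hypothetical names.

```python
def mu(p):
    """Unbounded minimization: the least i with p(i); loops forever if there is none."""
    i = 0
    while not p(i):
        i += 1
    return i


def c_less(a, b):
    """Characteristic function of <, with value 1 when a < b holds."""
    return 1 if a < b else 0


def monus(a, b):
    """Cut-off subtraction as the least i with c_less(b + i, a) = 0, i.e. b + i >= a."""
    return mu(lambda i: c_less(b + i, a) == 0)
```

The search always terminates here, because i = a already satisfies b + i >= a.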


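For the composition case, suppose f is defined from $h, g_1,\dots,g_m$ by 

\[ f(\vec a) = h(g_1(\vec a),\dots,g_m(\vec a)). \]

By induction hypothesis we already have representing formulas $A_{g_i}(\vec x, y_i)$ 
and $A_h(\vec y, z)$. As representing formula for f we take 

\[ A_f := \exists \vec y.\,A_{g_1}(\vec x, y_1) \land \dots \land A_{g_m}(\vec x, y_m) \land A_h(\vec y, z). \]

Assume $f(\vec a) = c$. Then there are $b_1,\dots,b_m$ such that $T \vdash A_{g_i}(\underline{\vec a}, \underline{b_i})$ for each 
i, and $T \vdash A_h(\underline{\vec b}, \underline{c})$ so by logic $T \vdash A_f(\underline{\vec a}, \underline{c})$. It remains to show uniqueness 
$T \vdash A_f(\underline{\vec a}, z_1) \to A_f(\underline{\vec a}, z_2) \to z_1 = z_2$. But this follows by logic from the 
induction hypothesis for $g_i$, which gives 

\[ T \vdash A_{g_i}(\underline{\vec a}, y_{1i}) \to A_{g_i}(\underline{\vec a}, y_{2i}) \to y_{1i} = y_{2i} = \underline{g_i(\vec a)} \]

and the induction hypothesis for h, which gives 

\[ T \vdash A_h(\underline{\vec b}, z_1) \to A_h(\underline{\vec b}, z_2) \to z_1 = z_2 \quad\text{with } b_i = g_i(\vec a). \]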


For the $\mu$ case, suppose f is defined from g (taken here to be binary for 
notational convenience) by $f(a) = \mu i\,(g(i, a) = 0)$, assuming $\forall a \exists i\,(g(i, a) = 0)$. 
By induction hypothesis we have a formula $A_g(y, x, z)$ representing g. 
In this case we represent f by the formula 

\[ A_f(x, y) := Ny \land A_g(y, x, 0) \land \forall v{\in}N.\,v < y \to \exists u.\,u \ne 0 \land A_g(v, x, u). \]

We first show the representability condition (16), that is $T \vdash A_f(\underline{a}, \underline{b})$ when 
$f(a) = b$. Because of the form of $A_f$ this follows from the assumed 
representability of g together with $T \vdash v < \underline{b} \to v = \underline{0} \lor \dots \lor v = \underline{b-1}$. 

We now tackle the uniqueness condition (18). Given a, let $b := f(a)$ (thus 
$g(b, a) = 0$ and b is the least such). It suffices to show $T \vdash A_f(\underline{a}, y) \to y = \underline{b}$, 
and we do this by proving $T \vdash y < \underline{b} \to \neg A_f(\underline{a}, y)$ and 
$T \vdash \underline{b} < y \to \neg A_f(\underline{a}, y)$, and then appealing to the trichotomy law. 

We first show $T \vdash y < \underline{b} \to \neg A_f(\underline{a}, y)$. Now since, for any $i < b$, 
$T \vdash \neg A_g(\underline{i}, \underline{a}, 0)$ by the assumed representability of g, we obtain immediately 
$T \vdash \neg A_f(\underline{a}, \underline{i})$. Hence because of $T \vdash y < \underline{b} \to y = \underline{0} \lor \dots \lor y = \underline{b-1}$ the 
claim follows. 

Secondly, $T \vdash \underline{b} < y \to \neg A_f(\underline{a}, y)$ follows almost immediately from 
$T \vdash \underline{b} < y \to A_f(\underline{a}, y) \to \exists u.\,u \ne 0 \land A_g(\underline{b}, \underline{a}, u)$ and the uniqueness for g, 
$T \vdash A_g(\underline{b}, \underline{a}, u) \to u = 0$. This now completes the proof. 
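
The representing formula for the $\mu$ case says, read in the standard model, that y is a zero of g and that no smaller v is one. The Python sketch below checks this reading against a direct search; it is a semantic illustration only, not a derivation in T, and the example function `g` as well as the names `mu` and `A_f` are assumptions.

```python
def mu(g, a):
    """Least i with g(i, a) == 0; loops forever if no such i exists."""
    i = 0
    while g(i, a) != 0:
        i += 1
    return i


def A_f(g, x, y):
    """Semantic reading of A_f(x, y): g(y, x) = 0 and g(v, x) != 0 for all v < y."""
    return g(y, x) == 0 and all(g(v, x) != 0 for v in range(y))


def g(i, a):
    """Hypothetical example: g(i, a) = 0 exactly when i * i >= a."""
    return 0 if i * i >= a else 1
```

For each a, exactly one y satisfies `A_f(g, a, y)`, namely `mu(g, a)`; the bounded universal quantifier is what excludes the larger zeros of g.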


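5.2. Robinson’s Theory Q. We conclude this section by considering 
a special and particularly simple arithmetical theory due originally to 
Robinson. Let $\mathcal{L}_1$ be the language given by 0, S, +, $\cdot$ and =, and let Q be 
the theory determined by the axioms $\mathrm{Eq}_{\mathcal{L}_1}$ and 

\[
\begin{aligned}
(29)\quad & S(x) \ne 0, \\
(30)\quad & S(x) = S(y) \to x = y, \\
(31)\quad & x + 0 = x, \\
(32)\quad & x + S(y) = S(x + y), \\
(33)\quad & x \cdot 0 = 0, \\
(34)\quad & x \cdot S(y) = x \cdot y + x, \\
(35)\quad & \exists z\,(x + S(z) = y) \lor x = y \lor \exists z\,(y + S(z) = x).
\end{aligned}
\]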
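
The recursion equations (31) to (34) suffice to compute every closed term to a numeral. The Python sketch below makes this explicit by using the equations as rewrite rules from left to right; the tuple encoding of terms and the names `numeral`, `add`, `mul` and `normalize` are assumptions made for illustration.

```python
ZERO = ("0",)


def S(t):
    """Successor term S(t)."""
    return ("S", t)


def numeral(n):
    """The numeral S(S(...S(0)...)) with n occurrences of S."""
    t = ZERO
    for _ in range(n):
        t = S(t)
    return t


def add(x, y):
    """x + y for numerals x, y, by recursion on y."""
    if y == ZERO:
        return x                     # (31)  x + 0 = x
    return S(add(x, y[1]))           # (32)  x + S(y) = S(x + y)


def mul(x, y):
    """x * y for numerals x, y, by recursion on y."""
    if y == ZERO:
        return ZERO                  # (33)  x * 0 = 0
    return add(mul(x, y[1]), x)      # (34)  x * S(y) = x * y + x


def normalize(t):
    """Rewrite a closed term built from 0, S, + and * to a numeral."""
    op = t[0]
    if op == "0":
        return ZERO
    if op == "S":
        return S(normalize(t[1]))
    x, y = normalize(t[1]), normalize(t[2])
    return add(x, y) if op == "+" else mul(x, y)
```

For instance, `normalize(("*", numeral(2), ("+", numeral(1), numeral(2))))` yields `numeral(6)`.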


THEOREM. Every theory $T \supseteq Q$ fulfills the assumptions of the theorem 
of Gödel-Rosser in 4.3, w.r.t. the definition $L(x, y) := \exists z\,(x + S(z) = y)$ of 
the <-relation. Moreover, every recursive function is representable in T. 


PROOF. We show that T with $N(x) := (x = x)$ and 
$L(x, y) := \exists z\,(x + S(z) = y)$ satisfies the conditions of the theorem in 5.1. For (21) and (22) 
this is clear. For (23) we can take $x + y = z$ and $x \cdot y = z$ as representing 
formulas. For (24) we have to show $\neg\exists z\,(x + S(z) = 0)$; this follows from 
(32) and (29). For the proof of (25) we need the auxiliary proposition 

\[ x = 0 \lor \exists y\,(x = 0 + S(y)), \tag{36} \]

which will be attended to below. So assume $x + S(z) = S(\underline{b})$, hence also 
$S(x + z) = S(\underline{b})$ and therefore $x + z = \underline{b}$. We now use (36) for z. In case $z = 0$ 
we obtain $x = \underline{b}$, and in case $\exists y\,(z = 0 + S(y))$ we have $\exists y'\,(x + S(y') = \underline{b})$, 
since $0 + S(y) = S(0 + y)$. Thus (25) is proved. (26) follows immediately 
from (35). 
For the proof of (36) we use (35) with $y = 0$. It clearly suffices to exclude 
the first case $\exists z\,(x + S(z) = 0)$. But this means $S(x + z) = 0$, contradicting 
(29). 


COROLLARY (Essential Undecidability of Q). Every consistent theory 
$T \supseteq Q$ is non-recursive. 


PROOF. By the theorems in 5.2 and 4.1. 


5.3. Undecidability of First Order Logic. As a simple corollary to 
the (essential) undecidability of Q we even obtain the undecidability of pure 
logic. 

COROLLARY (Undecidability of First Order Logic). The set of formulas 
derivable in classical first order logic is non-recursive. 


PROOF. Otherwise Q would be recursive, because a formula A is derivable 
in Q if and only if the implication $B \to A$ is derivable in classical first 
order logic, where B is the conjunction of the finitely many axioms and 
equality axioms of Q. 


REMARK. Notice that it suffices that the first order logic should have 
one binary relation symbol (for =), one constant symbol (for 0), one unary 
function symbol (for S) and two binary function symbols (for + and $\cdot$). The 
study of decidable fragments of first order logic is one of the oldest research 
areas of Mathematical Logic. For more information see Börger, Grädel and 
Gurevich [3]. 


5.4. Representability by $\Sigma_1$-formulas of the language $\mathcal{L}_1$. By 
reading through the above proof of representability, one sees easily that the 
representing formulas used are of a restricted form, having no unbounded 
universal quantifiers and therefore defining $\Sigma^0_1$-relations. This will be of 
crucial importance for our proof of Gödel’s Second Incompleteness Theorem to 
follow, but in addition we need to make a syntactically precise definition of 
the class of formulas actually involved. 


DEFINITION. The $\Sigma_1$-formulas of the language $\mathcal{L}_1$ are those generated 
inductively by the following clauses: 


• Only atomic formulas of the restricted forms $x = y$, $x \ne y$, $0 = x$, 
  $S(x) = y$, $x + y = z$ and $x \cdot y = z$ are allowed as $\Sigma_1$-formulas. 

• If A and B are $\Sigma_1$-formulas, then so are $A \land B$ and $A \lor B$. 

• If A is a $\Sigma_1$-formula, then so is $\forall x{<}y\ A$, which is an abbreviation 
  for $\forall x.\,\exists z\,(x + S(z) = y) \to A$. 

• If A is a $\Sigma_1$-formula, then so is $\exists x\,A$. 


COROLLARY. Every recursive function is representable in Q by a 
$\Sigma_1$-formula in the language $\mathcal{L}_1$. 


PROOF. This can be seen immediately by inspecting the proof of the 
theorem in 5.1. Only notice that because of the equality axioms 
$\exists z\,(x + S(z) = y)$ is equivalent to $\exists z \exists w\,(S(z) = w \land x + w = y)$ and A(0) is equivalent 
to $\exists x.\,0 = x \land A$. 


6. Unprovability of Consistency 


We have seen in the Gödel-Rosser Theorem how, for every axiomatized 
consistent theory T satisfying certain weak assumptions, we can construct 
an undecidable sentence A meaning “For every proof of me there is a shorter 
proof of my negation”. Because A is unprovable, it is clearly true. 

Gödel’s Second Incompleteness Theorem provides a particularly 
interesting alternative to A, namely a formula $\mathrm{Con}_T$ expressing the consistency 
of T. Again it turns out to be unprovable and therefore true. 

We shall prove this theorem in a sharpened form due to Löb. 


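6.1. $\Sigma_1$-Completeness of Q. We begin with an auxiliary proposition, 
expressing the completeness of Q with respect to $\Sigma_1$-formulas. 

LEMMA. Let $A(x_1,\dots,x_n)$ be a $\Sigma_1$-formula in the language $\mathcal{L}_1$ 
determined by 0, S, +, $\cdot$ and =. Assume that $\mathcal{N}_1 \models A[a_1,\dots,a_n]$ where $\mathcal{N}_1$ is 
the standard model of $\mathcal{L}_1$. Then $Q \vdash A(\underline{a_1},\dots,\underline{a_n})$. 


PROOF. By induction on the $\Sigma_1$-formulas of the language $\mathcal{L}_1$. For atomic 
formulas, the cases have been dealt with either in the earlier parts of the 
proof of the theorem in 5.1, or (for $x + y = z$ and $x \cdot y = z$) they follow from 
the recursion equations (31) - (34). 

Cases $A \land B$, $A \lor B$. The claim follows immediately from the induction 
hypothesis. 

Case $\forall x{<}y\ A(x, y, z_1,\dots,z_n)$; for simplicity assume n = 1. Suppose 
$\mathcal{N}_1 \models (\forall x{<}y\ A)[b, c]$. Then also $\mathcal{N}_1 \models A[i, b, c]$ for each $i < b$ and hence by 
induction hypothesis $Q \vdash A(\underline{i}, \underline{b}, \underline{c})$. Now by the theorem in 5.2 

\[ Q \vdash \forall x{<}\underline{b}.\ x = \underline{0} \lor \dots \lor x = \underline{b-1}, \]

hence 

\[ Q \vdash (\forall x{<}y\ A)(\underline{b}, \underline{c}). \]

Case $\exists x\,A(x, y_1,\dots,y_n)$; for simplicity take n = 1. Assume 
$\mathcal{N}_1 \models (\exists x\,A)[b]$. Then $\mathcal{N}_1 \models A[a, b]$ for some $a \in \mathbb{N}$, hence by induction hypothesis 
$Q \vdash A(\underline{a}, \underline{b})$ and therefore $Q \vdash \exists x\,A(x, \underline{b})$. 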
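
The case distinction of this proof is also the skeleton of a search procedure: bounded universal quantifiers are checked instance by instance, and existential quantifiers by looking for a witness. The Python sketch below evaluates $\Sigma_1$-formulas in the standard model in this way. It only illustrates why true $\Sigma_1$-sentences can be verified mechanically and does not construct derivations in Q; the tuple encoding of formulas and the names `holds` and `verify` are assumptions.

```python
from itertools import count


def holds(phi, env, bound):
    """Truth in the standard model, searching each unbounded witness below `bound`."""
    tag = phi[0]
    if tag == "eq":    # x = y
        return env[phi[1]] == env[phi[2]]
    if tag == "neq":   # x != y
        return env[phi[1]] != env[phi[2]]
    if tag == "zero":  # 0 = x
        return env[phi[1]] == 0
    if tag == "succ":  # S(x) = y
        return env[phi[1]] + 1 == env[phi[2]]
    if tag == "add":   # x + y = z
        return env[phi[1]] + env[phi[2]] == env[phi[3]]
    if tag == "mul":   # x * y = z
        return env[phi[1]] * env[phi[2]] == env[phi[3]]
    if tag == "and":
        return holds(phi[1], env, bound) and holds(phi[2], env, bound)
    if tag == "or":
        return holds(phi[1], env, bound) or holds(phi[2], env, bound)
    if tag == "ball":  # ("ball", x, y, A) stands for: for all x < y, A
        return all(holds(phi[3], {**env, phi[1]: i}, bound) for i in range(env[phi[2]]))
    if tag == "ex":    # ("ex", x, A) stands for: there is x with A
        return any(holds(phi[2], {**env, phi[1]: a}, bound) for a in range(bound))
    raise ValueError(tag)


def verify(phi, env):
    """Return the least bound that verifies phi; does not terminate if phi is false."""
    for bound in count(1):
        if holds(phi, env, bound):
            return bound


# "y is even", written as: there is x with x + x = y
even = ("ex", "x", ("add", "x", "x", "y"))
```

Since $\Sigma_1$-formulas contain no negated subformulas other than atomic inequalities, enlarging the bound never turns a verified formula into a refuted one; this is why the increasing search in `verify` is sound.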




6.2. Formalized $\Sigma_1$-Completeness. 


LEMMA. In an appropriate theory T of arithmetic with induction, we 
can formally prove for any $\Sigma_1$-formula A 

\[ A(\vec x) \to \exists p\,\mathrm{Prf}_T(p, \ulcorner A(\dot{\vec x})\urcorner). \]

Here $\mathrm{Prf}_T(p, z)$ is a suitable $\Sigma_1$-formula which represents in Robinson’s Q 
the recursive relation “a is the Gödel number of a proof in T of the formula 
with Gödel number b”. Also $\ulcorner A(\dot x)\urcorner$ is a term which represents, in Q, the 
numerical function mapping a number a to the Gödel number of $A(\underline{a})$. 


PROOF. We have not been precise about the theory T in which this 
result is to be formalized, but we shall content ourselves at this stage with 
merely pointing out, as we proceed, the basic properties that are required. 
Essentially T will be an extension of Q, together with induction formalized 
by the axiom schema 

\[ B(0) \land (\forall x.\,B(x) \to B(S(x))) \to \forall x\,B(x) \]

and it will be assumed that T has sufficiently many basic functions available 
to deal with the construction of appropriate Gödel numbers. 

The proof goes by induction on the build-up of the $\Sigma_1$-formula $A(\vec x)$. 

We consider three atomic cases, leaving the others to the reader. Suppose 
A(x) is the formula $0 = x$. We show $T \vdash 0 = x \to \exists p\,\mathrm{Prf}_T(p, \ulcorner 0 = \dot x\urcorner)$, by 
induction on x. The base case merely requires the construction of a numeral 
representing the Gödel number of the axiom $0 = 0$, and the induction step is 
trivial because $T \vdash S(x) \ne 0$. Secondly suppose A is the formula $x + y = z$. 
We show $T \vdash \forall z.\,x + y = z \to \exists p\,\mathrm{Prf}_T(p, \ulcorner \dot x + \dot y = \dot z\urcorner)$ by induction on y. If 
$y = 0$, the assumption gives $x = z$ and one requires only the Gödel number 
for the axiom $\forall x\,(x + 0 = x)$ which, when applied to the Gödel number of 
the x-th numeral, gives $\exists p\,\mathrm{Prf}_T(p, \ulcorner \dot x + 0 = \dot z\urcorner)$. If y is a successor $S(u)$, 
then the assumption gives $z = S(v)$ where $x + u = v$, so by the induction 
hypothesis we already have a p such that $\mathrm{Prf}_T(p, \ulcorner \dot x + \dot u = \dot v\urcorner)$. Applying 
the successor to both sides, one then easily obtains from p a p' such that 
$\mathrm{Prf}_T(p', \ulcorner \dot x + \dot y = \dot z\urcorner)$. Thirdly suppose A is the formula $x \ne y$. We show 
$T \vdash \forall y.\,x \ne y \to \exists p\,\mathrm{Prf}_T(p, \ulcorner \dot x \ne \dot y\urcorner)$ by induction on x. The base case 
$x = 0$ requires a subinduction on y. If $y = 0$, then the claim is trivial (by 
ex-falso). If $y = S(u)$, we have to produce a Gödel number p such that 
$\mathrm{Prf}_T(p, \ulcorner 0 \ne S(\dot u)\urcorner)$, but this is just an axiom. Now consider the step case 
$x = S(v)$. Again we need an auxiliary induction on y. Its base case is dealt 
with exactly as before, and when $y = S(u)$ it uses the induction hypothesis 
for $v \ne u$ together with the injectivity of the successor. 

The cases where A is built up by conjunction or disjunction are rather 
trivial. One only requires, for example in the conjunction case, a function 
which combines the Gödel numbers of the proofs of the separate conjuncts 
into a single Gödel number of a proof of the conjunction A itself. 

Now consider the case $\exists y\,A(y, x)$ (with just one parameter x for 
simplicity). By the induction hypothesis we already have 
$T \vdash A(y, x) \to \exists p\,\mathrm{Prf}_T(p, \ulcorner A(\dot y, \dot x)\urcorner)$. But any Gödel number p such that $\mathrm{Prf}_T(p, \ulcorner A(\dot y, \dot x)\urcorner)$ 
can easily be transformed (by formally applying the $\exists$-rule) into a Gödel 
number p' such that $\mathrm{Prf}_T(p', \ulcorner \exists y\,A(y, \dot x)\urcorner)$. Therefore we obtain as required, 
$T \vdash \exists y\,A(y, x) \to \exists p'\,\mathrm{Prf}_T(p', \ulcorner \exists y\,A(y, \dot x)\urcorner)$. 

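Finally suppose the $\Sigma_1$-formula is of the form $\forall u{<}y\ A(u, x)$. We must 
show 

\[ \forall u{<}y\ A(u, x) \to \exists p\,\mathrm{Prf}_T(p, \ulcorner \forall u{<}\dot y\ A(u, \dot x)\urcorner). \]

By the induction hypothesis 

\[ T \vdash A(u, x) \to \exists p\,\mathrm{Prf}_T(p, \ulcorner A(\dot u, \dot x)\urcorner) \]

so by logic 

\[ T \vdash \forall u{<}y\ A(u, x) \to \forall u{<}y\,\exists p\,\mathrm{Prf}_T(p, \ulcorner A(\dot u, \dot x)\urcorner). \]

The required result now follows immediately from the auxiliary lemma: 

\[ T \vdash \forall u{<}y\,\exists p\,\mathrm{Prf}_T(p, \ulcorner A(\dot u, \dot x)\urcorner) \to \exists q\,\mathrm{Prf}_T(q, \ulcorner \forall u{<}\dot y\ A(u, \dot x)\urcorner). \]

It remains only to prove this, which we do by induction on y (inside T). In 
case $y = 0$ a proof of $u < 0 \to A$ is trivial, by ex-falso, so the required Gödel 
number q is easily constructed. For the step case $y = S(z)$ by assumption 
we have $\forall u{<}z\,\exists p\,\mathrm{Prf}_T(p, \ulcorner A(\dot u, \dot x)\urcorner)$, hence $\exists q\,\mathrm{Prf}_T(q, \ulcorner \forall u{<}\dot z\ A(u, \dot x)\urcorner)$ by IH. 
Also $\exists p'\,\mathrm{Prf}_T(p', \ulcorner A(\dot z, \dot x)\urcorner)$. Now we only have to combine p' and q to obtain 
(by means of an appropriate “simple” function) a Gödel number q' so that 
$\mathrm{Prf}_T(q', \ulcorner \forall u{<}\dot y\ A(u, \dot x)\urcorner)$. 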


6.3. Derivability Conditions. So now let T be an axiomatized 
consistent theory with $T \supseteq Q$, and possessing “enough” induction to formalize 
$\Sigma_1$-completeness as we have just done. Define, from the associated formula 
$\mathrm{Prf}_T$, the following $\mathcal{L}_1$-formulas: 

\[
\begin{aligned}
\mathrm{Thm}_T(x) &:= \exists y\,\mathrm{Prf}_T(y, x), \\
\mathrm{Con}_T &:= \neg\exists y\,\mathrm{Prf}_T(y, \ulcorner\bot\urcorner).
\end{aligned}
\]

So $\mathrm{Thm}_T(x)$ defines in $\mathcal{N}_1$ the set of formulas provable in T, and we have 
$\mathcal{N}_1 \models \mathrm{Con}_T$ if and only if T is consistent. For $\mathcal{L}_1$-formulas A let 
$\Box A := \mathrm{Thm}_T(\ulcorner A\urcorner)$. 

Now consider the following two derivability conditions for T 
(Hilbert-Bernays [12]) 

\[ T \vdash A \to \Box A \quad (A \text{ closed } \Sigma_1\text{-formula of the language } \mathcal{L}_1), \tag{37} \]

\[ T \vdash \Box(A \to B) \to \Box A \to \Box B. \tag{38} \]

(37) is just a special case of formalized $\Sigma_1$-completeness for closed formulas, 
and (38) requires only that the theory T has a term that constructs, from 
the Gödel number of a proof of $A \to B$ and the Gödel number of a proof 
of A, the Gödel number of a proof of B, and furthermore this fact must be 
provable in T. 

THEOREM (Gödel’s Second Incompleteness Theorem). Let T be an 
axiomatized consistent extension of Q, satisfying the derivability conditions 
(37) and (38). Then $T \nvdash \mathrm{Con}_T$. 


PROOF. Let $C := \bot$ in the theorem below, which is Löb’s generalization 
of Gödel’s original proof. 
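
The instantiation can be spelled out in one line. The sketch below assumes, as in the coding $\dot\neg a := a \mathbin{\dot\to} \ulcorner\bot\urcorner$ used earlier, that $\neg A$ abbreviates $A \to \bot$, so that $\mathrm{Con}_T$ is literally the formula $\Box\bot \to \bot$.

```latex
\begin{align*}
\mathrm{Con}_T
  &= \neg\exists y\,\mathrm{Prf}_T(y, \ulcorner\bot\urcorner)
   = \neg\Box\bot
   = (\Box\bot \to \bot), \\
T \vdash \mathrm{Con}_T
  &\;\Longrightarrow\; T \vdash \Box\bot \to \bot
   \;\Longrightarrow\; T \vdash \bot
  && \text{(L\"ob's Theorem with } C := \bot\text{)}.
\end{align*}
```

Since T is assumed consistent, $T \vdash \bot$ is impossible, hence $T \nvdash \mathrm{Con}_T$.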


THEOREM (Löb). Let T be an axiomatized consistent extension of Q 
satisfying the derivability conditions (37) and (38). Then for any closed 
$\mathcal{L}_1$-formula C, if $T \vdash \Box C \to C$ (that is, $T \vdash \mathrm{Thm}_T(\ulcorner C\urcorner) \to C$), then already 
$T \vdash C$. 


PROOF. Assume $T \vdash \Box C \to C$. Choose A by the Fixed Point Lemma 
in 3.2 such that 

\[ Q \vdash A \leftrightarrow (\Box A \to C). \tag{39} \]

We must show $T \vdash C$. First we show $T \vdash \Box A \to C$, as follows. 

\[
\begin{aligned}
& T \vdash A \to \Box A \to C && \text{by (39)} \\
& T \vdash \Box(A \to \Box A \to C) && \text{by } \Sigma_1\text{-completeness} \\
& T \vdash \Box A \to \Box(\Box A \to C) && \text{by (38)} \\
& T \vdash \Box A \to \Box\Box A \to \Box C && \text{again by (38)} \\
& T \vdash \Box A \to \Box C && \text{because } T \vdash \Box A \to \Box\Box A \text{ by (37).}
\end{aligned}
\]

Therefore from the assumption $T \vdash \Box C \to C$ we obtain $T \vdash \Box A \to C$. 
This implies $T \vdash A$ by (39), and then $T \vdash \Box A$ by $\Sigma_1$-completeness. But 
$T \vdash \Box A \to C$ as we have just shown, therefore $T \vdash C$. 

REMARK. It follows immediately that if T is any axiomatized consistent 
extension of Q satisfying the derivability conditions (37) and (38), then the 
reflection scheme 

\[ \mathrm{Thm}_T(\ulcorner C\urcorner) \to C \quad\text{for closed } \mathcal{L}_1\text{-formulas } C \]

is not derivable in T. For by Löb’s Theorem, it cannot be derivable when 
C is underivable. 

By adding to Q the induction scheme for all formulas we obtain 
Peano-arithmetic PA, which is the most natural example of a theory T to which 
the results above apply. However, various weaker fragments of PA, obtained 
by restricting the classes of induction formulas, would serve equally well as 
examples of such T. 


7. Notes 


The undecidability of first order logic was first proved by Church;
however, the basic idea of the proof was already present in Gödel’s [11].

The fundamental papers on incompleteness are Gödel’s [10] (from 1930)
and [11] (from 1931). Gödel also discovered the $\beta$-function, which is of
central importance for the representation theorem; he made use of the Fixed
Point Lemma only implicitly. His first Incompleteness Theorem is based
on the formula “I am not provable”, a fixed point of $\neg\mathrm{Thm}_T(x)$. For the
independence of this proposition from the underlying theory $T$ he had to
assume $\omega$-consistency of $T$. Rosser (1936) proved the sharper result
reproduced here, using a formula with the meaning “for every proof of me there
is a shorter proof of my negation”. The undefinability of the notion of truth
was first proved by Tarski (1939). The arithmetical theories $R$ and $Q_0$
(in Exercises 46 and 47) are due to R. Robinson (1950). $R$ is essentially
undecidable, incomplete and strong enough for $\Sigma_1$-completeness; moreover, all
recursive relations are representable in $R$. $Q_0$ is a very natural theory and,
in contrast to $R$, finite. $Q_0$ is minimal in the following sense: if one axiom is
deleted, then the resulting theory is no longer essentially undecidable.
The first essentially undecidable theory was found by Mostowski and Tarski
(1939); when reading the manuscript, J. Robinson had the idea of treating
recursive functions without the scheme of primitive recursion.

Important examples of undecidable theories are (in historical order):
arithmetic of natural numbers (Rosser, 1936), arithmetic of integers (Tarski,
Mostowski, 1949), arithmetic of rationals and the theory of ordered fields
(J. Robinson, 1949), group theory and lattice theory (Tarski, 1949). This
is in contrast to the following decidable theories: the theory of addition
for natural numbers (Presburger, 1929), that of multiplication (Mostowski,
1952), the theory of abelian groups (Szmielew, 1949), of algebraically closed
fields and of boolean algebras (Tarski, 1949), the theory of linearly ordered
sets (Ehrenfeucht, 1959).


CHAPTER 5 


Set Theory 


1. Cumulative Type Structures 


Set theory can be viewed as a framework within which mathematics can 
be given a foundation. Here we want to develop set theory as a formal theory 
within mathematical logic. But first it is necessary to have an intuitive 
picture of the notion of a set, to be described by the axioms. 


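1.1. Cantor’s Definition. Cantor in 1895 gave the following definition:
    Unter einer “Menge” verstehen wir jede Zusammenfassung
    M von bestimmten wohlunterschiedenen Objekten m unserer
    Anschauung oder unseres Denkens (welche die Elemente
    von M genannt werden) zu einem Ganzen.
    [By a “set” we understand any collection into a whole M of
    definite, well-distinguished objects m of our intuition or our
    thought (which are called the elements of M).]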


One can try to make this definition more precise, as follows. Let $V$ be the
collection of all objects “unserer Anschauung oder unseres Denkens” (of our
intuition or our thought). Let $A(x)$ denote properties of objects $x$ from $V$.
Then one can form the set $\{ x \mid A(x) \}$, the set of all objects $x$ of $V$ with
the property $A(x)$. According to Cantor’s definition $\{ x \mid A(x) \}$ is again an
object in $V$.

Examples of properties: (1) $x$ is a natural number. (2) $x$ is a set. (3) $x$
is a point, $y$ is a line and $x$ lies on $y$. (4) $y$ is a set and $x$ is an element of $y$,
in short: $\mathrm{Set}(y) \land x \in y$.

However, Cantor’s definition cannot be accepted in its original form, for
it leads to contradictions. The best known is Russell’s antinomy: Let
$x_0 := \{ x \mid \mathrm{Set}(x) \land x \notin x \}$. Then

    $x_0 \in x_0 \leftrightarrow \mathrm{Set}(x_0) \land x_0 \notin x_0 \leftrightarrow x_0 \notin x_0$,

for $x_0$ is a set.


1.2. Shoenfield’s Principle. The root of this contradiction is the
fact that in Cantor’s definition we accept the concept of a finished totality
of all sets. However, this is neither necessary nor does it mirror the usual 
practice of mathematics. It completely suffices to form a set only if all its 
elements “are available” already. This leads to the concept of a stepwise 
construction of sets, or more precisely to the cumulative type structure: We 
start with certain “urelements”, that form the sets of level 0. Then on an 
arbitrary level we can form all sets whose elements belong to earlier levels. 

If for instance we take as urelements the natural numbers, then {27, {5}} 
belongs to level 2. 

The following natural questions arise: (1) Which urelements
should we choose? (2) How far do the levels reach?

Ad (1). For the purposes of mathematics it is completely sufficient not
to assume any urelements at all; then one speaks of pure sets. This will be
done in the following.

    Level 0: (no sets)
    Level 1: $\emptyset$
    Level 2: $\emptyset$, $\{\emptyset\}$
    Level 3: $\emptyset$, $\{\emptyset\}$, $\{\{\emptyset\}\}$, $\{\emptyset, \{\emptyset\}\}$


and so on. 
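
The first few levels can be enumerated mechanically. The following is a minimal sketch, assuming that pure sets are modelled by Python frozensets and that level n + 1 consists of all subsets of level n; the helper names powerset and level are illustrative and not part of the text.

```python
from itertools import chain, combinations

def powerset(s):
    """Return the set of all subsets of the finite set s, each as a frozenset."""
    items = list(s)
    subsets = chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)

def level(n):
    """Sets available at level n when there are no urelements.

    Level 0 is empty; level n + 1 consists of all sets whose elements
    belong to level n (which already contains all earlier levels).
    """
    current = frozenset()
    for _ in range(n):
        current = powerset(current)
    return current

# Number of pure sets at levels 0 to 4.
sizes = [len(level(n)) for n in range(5)]
print(sizes)
```

The sizes grow as iterated powers of two, so only the first few levels can be listed explicitly.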


Ad (2). In [23], Shoenfield formulated the following principle: 


Consider a collection S of levels. If a situation can be 
conceived where all the levels from S are constructed, then 
there exists a level which is past all those levels. 


From this admittedly rather vague principle we shall draw exact consequences,
which will be fixed as axioms.

By a set we intuitively understand an object that belongs to some level 
of the cumulative type structure. By a class we mean an arbitrary collection 
of sets. 

So every set clearly is a class. Moreover there are classes that are not 
sets, for instance the class V of all sets. 


2. Axiomatic Set Theory 


In set theory, as in any axiomatic theory, we have to state explicitly
all properties used, including the “obvious” ones.


2.1. Extensionality, Equality. The language of set theory has a single
non-logical symbol, the element relation $\in$. So the only atomic formulas
are of the form $x \in y$ ($x$ is an element of $y$). Equality $x = y$ is defined by

    $x = y := \forall z.\, z \in x \leftrightarrow z \in y$.

To ensure compatibility of the $\in$-relation with equality we need an axiom:

AXIOM (Extensionality).

    $x = y \to x \in z \to y \in z$.

REMARK. If alternatively equality is to be used as a primitive symbol,
one must require the equality axioms and in addition

    $(\forall z.\, z \in x \leftrightarrow z \in y) \to x = y$.


As classes in our axiomatic theory we only allow definable collections of
sets. By “definable” we mean definable by a formula in the language of set
theory. More precisely: If $A(x)$ is a formula, then

    $\{ x \mid A(x) \}$

denotes the class of all sets $x$ with the property $A(x)$.

Instead of classes we could have used properties or more precisely
formulas as well. However, classes allow for a simpler and more suggestive
formulation of many of the propositions we want to consider.

If $A(x)$ is the formula $x = x$, then $\{ x \mid A(x) \}$ is called the all class or
the (set theoretic) universe. If $A(x)$ is the formula $x \notin x$, then $\{ x \mid A(x) \}$
is called the Russell class.

We now give some definitions that will be used throughout the following.
A set $b$ is an element of the class $\{ x \mid A(x) \}$ if $A(b)$ holds:

    $b \in \{ x \mid A(x) \} := A(b)$.

Two classes $A$, $B$ are equal if they have the same elements:

    $A = B := \forall x.\, x \in A \leftrightarrow x \in B$.

If $A$ is a class and $b$ a set, then $A$ and $b$ are called equal if they have the
same elements:

    $A = b := \forall x.\, x \in A \leftrightarrow x \in b$.

In this case we identify the class $A$ with this set $b$. Instead of “$A$ is a set” we
also write $A \in V$. A class $B$ is an element of a set $a$ (of a class $A$, resp.) if
$B$ is equal to an element $x$ of $a$ (of $A$, resp.):

    $B \in a := \exists x.\, x \in a \land B = x$,
    $B \in A := \exists x.\, x \in A \land B = x$.

A class $A$ is a proper class if $A$ is not a set:

    $A$ proper class $:= \forall x\, (x \neq A)$.

REMARK. Every set $b$ is a class, since

    $b = \{ x \mid x \in b \}$.

The Russell class is a proper class, for if $\{ x \mid x \notin x \} = x_0$, we would have

    $x_0 \in x_0 \leftrightarrow x_0 \notin x_0$.

So the Russell construction is not an antinomy any more, but simply says
that there are sets and (proper) classes.


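Let $A$, $B$ be classes (proper classes or sets) and $a, b, a_1, \dots, a_n$ sets. We
define

    $\{a_1, \dots, a_n\} := \{ x \mid x = a_1 \lor \dots \lor x = a_n \}$,
    $\emptyset := \{ x \mid x \neq x \}$                                  empty class,
    $V := \{ x \mid x = x \}$                                          all class,
    $A \subseteq B := \forall x.\, x \in A \to x \in B$                $A$ is subclass of $B$,
    $A \subsetneq B := A \subseteq B \land A \neq B$                   $A$ is proper subclass of $B$,
    $A \cap B := \{ x \mid x \in A \land x \in B \}$                   intersection,
    $A \cup B := \{ x \mid x \in A \lor x \in B \}$                    union,
    $A \setminus B := \{ x \mid x \in A \land x \notin B \}$           difference,
    $\bigcup A := \{ x \mid \exists y.\, y \in A \land x \in y \}$     big union,
    $\bigcap A := \{ x \mid \forall y.\, y \in A \to x \in y \}$       big intersection,
    $P(A) := \{ x \mid x \subseteq A \}$                               power class of $A$.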


In particular $a \cup b = \bigcup\{a, b\}$ and $a \cap b = \bigcap\{a, b\}$, and $\bigcap\emptyset$ is the all class.
Moreover $P(A)$ is the class of all subclasses of $A$ that happen to be sets.

2.2. Pairs, Relations, Functions, Unions. Ordered pairs are
defined by means of a little trick due to Kuratowski:

    $(a, b) := \{ x \mid x = \{a\} \lor x = \{a, b\} \}$    (ordered) pair,

so $(a, b) = \{\{a\}, \{a, b\}\}$. To make sure that $(a, b)$ is not the empty class, we
have to require axiomatically that $\{a\}$ and $\{a, b\}$ are sets:

AXIOM (Pairing). 

{x,y} is a set. 

In the cumulative type structure the pairing axiom clearly holds, because
for any two levels $S_1$ and $S_2$ by the Shoenfield principle there must be a level
$S$ coming after $S_1$ and $S_2$.

Explicitly the pairing axiom is $\forall x \forall y \exists z \forall u.\, u \in z \leftrightarrow u = x \lor u = y$. In
particular it follows that for every set $a$ the singleton class $\{a\}$ is a set. It
also follows that $(a, b) = \{\{a\}, \{a, b\}\}$ is a set.
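
The point of Kuratowski's trick is that both components can be recovered from the pair. A minimal sketch, assuming sets are modelled by Python frozensets; the projection helpers first and second are illustrative names that do not occur in the text.

```python
def pair(a, b):
    """Kuratowski pair (a, b) = {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})

def first(p):
    """First component: the unique element common to all members of p."""
    return next(iter(frozenset.intersection(*p)))

def second(p):
    """Second component: the element of the union of p other than the
    first component, or the first component itself when a = b."""
    union = frozenset.union(*p)
    rest = union - {first(p)}
    return next(iter(rest)) if rest else first(p)

empty = frozenset()
one = frozenset({empty})
p = pair(empty, one)
print(first(p) == empty, second(p) == one)
```

Note that pair(a, a) collapses to {{a}}, which is why second needs the special case.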
Moreover we define 


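    $\{ (x, y) \mid A(x, y) \} := \{ z \mid \exists x, y.\, A(x, y) \land z = (x, y) \}$

and

    $A \times B := \{ (x, y) \mid x \in A \land y \in B \}$                          cartesian product of $A$, $B$,
    $\mathrm{dom}(A) := \{ x \mid \exists y\, ((x, y) \in A) \}$                     domain of $A$,
    $\mathrm{rng}(A) := \{ y \mid \exists x\, ((x, y) \in A) \}$                     range of $A$,
    $A \restriction B := \{ (x, y) \mid (x, y) \in A \land x \in B \}$               restriction of $A$ to $B$,
    $A[B] := \{ y \mid \exists x.\, x \in B \land (x, y) \in A \}$                   image of $B$ under $A$,
    $A^{-1} := \{ (y, x) \mid (x, y) \in A \}$                                       inverse of $A$,
    $A \circ B := \{ (x, z) \mid \exists y.\, (x, y) \in B \land (y, z) \in A \}$    composition of $A$, $B$.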


Without any difficulty we can introduce the usual notions concerning
relations and functions. For classes $A$, $B$ and $C$ we define

(a) $A$ is a relation iff $A \subseteq V \times V$. Hence a relation is a class of pairs. Instead
    of $(a, b) \in A$ we also write $aAb$.
(b) $A$ is a relation on $B$ iff $A \subseteq B \times B$.
(c) $A$ is a function iff $A$ is a relation and

        $\forall x, y, z.\, (x, y) \in A \to (x, z) \in A \to y = z$.

(d) $A\colon B \to C$ iff $A$ is a function such that $\mathrm{dom}(A) = B$ and $A[B] \subseteq C$. We
    then call $A$ a function from $B$ to $C$.
(e) $A\colon B \xrightarrow{\text{onto}} C$ iff $A\colon B \to C$ and $A[B] = C$. We then call $A$ a surjective
    function from $B$ onto $C$.
(f) $A$ is injective iff $A$ and $A^{-1}$ are functions.
(g) $A\colon B \leftrightarrow C$ iff $A\colon B \xrightarrow{\text{onto}} C$ and $A$ is injective. Then $A$ is called a bijective
    function from $B$ onto $C$.
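
For finite relations these notions can be tried out directly. The sketch below is only an illustration under a simplifying assumption: Python tuples stand in for Kuratowski pairs, and relations are finite Python sets of such tuples. All function names are chosen here for the example.

```python
def dom(A):
    """Domain: all first components."""
    return {x for (x, _) in A}

def rng(A):
    """Range: all second components."""
    return {y for (_, y) in A}

def restrict(A, B):
    """Restriction of A to B: pairs whose first component lies in B."""
    return {(x, y) for (x, y) in A if x in B}

def image(A, B):
    """Image A[B] of B under A."""
    return {y for (x, y) in A if x in B}

def inverse(A):
    """Inverse relation: every pair swapped."""
    return {(y, x) for (x, y) in A}

def compose(A, B):
    """A o B: first apply B, then A."""
    return {(x, z) for (x, y) in B for (y2, z) in A if y == y2}

def is_function(A):
    """No first component is related to two different second components."""
    return all(y == z for (x, y) in A for (x2, z) in A if x == x2)

def is_injective(A):
    """A and its inverse are both functions."""
    return is_function(A) and is_function(inverse(A))

A = {(1, "a"), (2, "b"), (3, "a")}
B = {(0, 1), (5, 2)}
print(sorted(compose(A, B)))
```

Here A is a function that is not injective, since both 1 and 3 are mapped to "a".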
For the further development of set theory more axioms are necessary, in
particular

AXIOM (Union).

    $\bigcup x$ is a set.

The union axiom holds in the cumulative type structure. To see this,
consider a level $S$ where $x$ is formed. An arbitrary element $v \in x$ then is
available at an earlier level $S_v$ already. Similarly every element $u \in v$ is
present at a level $S_u$ before $S_v$. But all these $u$ make up $\bigcup x$. Hence also
$\bigcup x$ can be formed at level $S$.

Explicitly the union axiom is $\forall x \exists y \forall z.\, z \in y \leftrightarrow \exists u.\, u \in x \land z \in u$.

We can now extend the previous definition by

    $A(x) := \bigcup \{ y \mid (x, y) \in A \}$    application.

If $A$ is a function and $(x, y) \in A$, then $A(x) = \bigcup\{y\} = y$ and we write
$A\colon x \mapsto y$.


2.3. Separation, Power Set, Replacement Axioms.

AXIOM (Separation). For every class $A$,

    $A \subseteq x \to \exists y\, (A = y)$.

So the separation scheme says that every subclass $A$ of a set $x$ is a set.
It is valid in the cumulative type structure, since on the same level where $x$
is formed we can also form the set $y$, whose elements are just the elements
of the class $A$.

Notice that the separation scheme consists of infinitely many axioms. 


AXIOM (Power set). 


P(x) is a set. 

The power set axiom holds in the cumulative type structure. To see
this, consider a level $S$ where $x$ is formed. Then also every subset $y \subseteq x$ has
been formed at level $S$. On the next level $S'$ (which exists by the Shoenfield
principle) we can form $P(x)$.

Explicitly the power set axiom is $\forall x \exists y \forall z.\, z \in y \leftrightarrow z \subseteq x$.

LEMMA 2.1. $a \times b$ is a set.

PROOF. We show $a \times b \subseteq P(P(a \cup b))$. So let $x \in a$ and $y \in b$. Then

    $\{x\}, \{x, y\} \subseteq a \cup b$
    $\{x\}, \{x, y\} \in P(a \cup b)$
    $\{ \{x\}, \{x, y\} \} \subseteq P(a \cup b)$
    $(x, y) = \{ \{x\}, \{x, y\} \} \in P(P(a \cup b))$.

The claim now follows from the union axiom, the pairing axiom, the power
set axiom and the separation scheme.


AXIOM (Replacement). For every class $A$,

    $A$ is a function $\to \forall x\, (A[x]$ is a set$)$.

Also the replacement scheme holds in the cumulative type structure;
however, this requires some more thought. Consider all elements $u$ of the
set $x \cap \mathrm{dom}(A)$. For every such $u$ we know that $A(u)$ is a set, hence is formed
at a level $S_u$ of the cumulative type structure. Because $x \cap \mathrm{dom}(A)$ is a set,
we can imagine a situation where all $S_u$ for $u \in x \cap \mathrm{dom}(A)$ are constructed.
Hence by the Shoenfield principle there must be a level $S$ coming after all
these $S_u$. In $S$ we can form $A[x]$.

LEMMA 2.2. The replacement scheme implies the separation scheme.

PROOF. Let $A \subseteq x$ and $B := \{ (u, v) \mid u = v \land u \in A \}$. Then $B$ is a
function and we have $B[x] = A$.


This does not yet conclude our list of axioms of set theory: later we will 
require the infinity axiom, the regularity axiom and the axiom of choice. 


3. Recursion, Induction, Ordinals 


We want to develop a general framework for recursive definitions and 
inductive proofs. Both will be done by means of so-called well-founded 
relations. To carry this through, we introduce as an auxiliary notion that of 
a transitively well-founded relation; later we will see that it is equivalent to 
the notion of a well-founded relation. We then define the natural numbers in 
the framework of set theory, and will obtain induction and recursion on the 
natural numbers as special cases of the corresponding general theorems for 
transitively well-founded relations. By recursion on natural numbers we can 
then define the transitive closure of a set, and by means of this notion we 
will be able to show that well-founded relations coincide with the transitively 
well-founded relations. 

Then we study particular well-founded relations. We first show that
arbitrary classes together with the $\in$-relation are up to isomorphism the
only well-founded extensional relations (Isomorphy Theorem of Mostowski).
Then we consider linear well-founded orderings, called well-orderings. Since
they will always be extensional, they must be isomorphic to certain classes
with the $\in$-relation, which will be called ordinal classes. Ordinals can then
be defined as those ordinal classes that happen to be sets.


3.1. Recursion on Transitively Well-Founded Relations. Let $A$,
$B$, $C$ denote classes. For an arbitrary relation $R$ on $A$ we define
(a) $\hat{x}^R := \{ y \mid yRx \}$ is the class of $R$-predecessors of $x$. We shall write $\hat{x}$
    instead of $\hat{x}^R$, if $R$ is clear from the context.
(b) $B \subseteq A$ is called $R$-transitive if

        $\forall x.\, x \in B \to \hat{x} \subseteq B$.

    Hence $B \subseteq A$ is $R$-transitive iff $yRx$ and $x \in B$ imply $y \in B$.
(c) Let $B \subseteq A$. $x \in B$ is an $R$-minimal element of $B$ if $\hat{x} \cap B = \emptyset$.
(d) $R$ is a transitively well-founded relation on $A$ if
    (i) Every nonempty subset of $A$ has an $R$-minimal element, i.e.

            $\forall a.\, a \subseteq A \to a \neq \emptyset \to \exists x.\, x \in a \land \hat{x} \cap a = \emptyset$.

    (ii) For every $x \in A$ there is an $R$-transitive set $b \subseteq A$ such that $\hat{x} \subseteq b$.
We shall almost everywhere omit $R$, if $R$ is clear from the context.


REMARK. Let $R$ be a relation on $A$. $R$ is a transitive relation on $A$ if
for all $x, y, z \in A$

    $xRy \to yRz \to xRz$.

We have the following connection to the notion of $R$-transitivity for classes:
Let $R$ be a relation on $A$. Then

    $R$ is a transitive relation on $A$ $\leftrightarrow$ for every $y \in A$, $\hat{y}$ is $R$-transitive.

PROOF. $\to$. Let $R$ be a transitive relation on $A$, $y \in A$ and $x \in \hat{y}$,
hence $xRy$. We must show $\hat{x} \subseteq \hat{y}$. So let $zRx$. We must show $zRy$. But
this follows from the transitivity of $R$. $\leftarrow$. Let $x, y, z \in A$, $xRy$ and $yRz$.
We must show $xRz$. We have $xRy$ and $y \in \hat{z}$. Since $\hat{z}$ is $R$-transitive, we
obtain $x \in \hat{z}$, hence $xRz$.


LEMMA 3.1. Let $R$ be a transitively well-founded relation on $A$. Then

(a) Every nonempty subclass $B \subseteq A$ has an $R$-minimal element.
(b) $\forall x.\, x \in A \to \hat{x}$ is a set.

PROOF. (a). Let $B \subseteq A$ and $z \in B$. We may assume that $z$ is not an
$R$-minimal element of $B$, i.e. $\hat{z} \cap B \neq \emptyset$. By part (ii) of the definition of
transitively well-founded relations there exists an $R$-transitive superset $b \subseteq A$ of $\hat{z}$.
Because of $\hat{z} \cap B \neq \emptyset$ we have $b \cap B \neq \emptyset$. By part (i) of the same definition
there exists an $R$-minimal $x \in b \cap B$, i.e., $\hat{x} \cap b \cap B = \emptyset$. Since $b$ is
$R$-transitive, from $x \in b$ we obtain $\hat{x} \subseteq b$. Therefore $\hat{x} \cap B = \emptyset$ and hence
$x$ is an $R$-minimal element of $B$.

(b). This is a consequence of the separation scheme.


3.2. Induction and Recursion Theorems. We write $\forall x{\in}A \ldots$ for
$\forall x.\, x \in A \to \ldots$ and similarly $\exists x{\in}A \ldots$ for $\exists x.\, x \in A \land \ldots$.

THEOREM 3.2 (Induction Theorem). Let $R$ be a transitively well-founded
relation on $A$ and $B$ an arbitrary class. Then

    $\forall x{\in}A.\, \hat{x} \subseteq B \to x \in B$

implies $A \subseteq B$.

PROOF. Assume $A \setminus B \neq \emptyset$. Let $x$ be a minimal element of $A \setminus B$. It
suffices to show $\hat{x} \subseteq B$, for then by assumption we obtain $x \in B$, hence a
contradiction. Let $z \in \hat{x}$. By the choice of $x$ we have $z \notin A \setminus B$, hence $z \in B$
(because $z \in A$ holds, since $R$ is a relation on $A$).


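THEOREM 3.3 (Recursion Theorem). Let $R$ be a transitively well-founded
relation on $A$ and $G\colon V \to V$. Then there exists exactly one function
$F\colon A \to V$ such that

    $\forall x{\in}A\, (F(x) = G(F \restriction \hat{x}))$.

PROOF. First observe that for $F\colon A \to V$ we have $F \restriction \hat{x} \subseteq \hat{x} \times F[\hat{x}]$,
hence $F \restriction \hat{x}$ is a set.

Uniqueness. Given $F_1$, $F_2$. Consider

    $\{ x \mid x \in A \land F_1(x) = F_2(x) \} =: B$.

By the Induction Theorem it suffices to show $\forall x{\in}A.\, \hat{x} \subseteq B \to x \in B$. So let
$x \in A$ and $\hat{x} \subseteq B$. Then

    $F_1 \restriction \hat{x} = F_2 \restriction \hat{x}$
    $G(F_1 \restriction \hat{x}) = G(F_2 \restriction \hat{x})$
    $F_1(x) = F_2(x)$
    $x \in B$.

Existence. Let

    $B := \{ f \mid f$ function, $\mathrm{dom}(f)$ $R$-transitive subset of $A$,
              $\forall x{\in}\mathrm{dom}(f)\, (f(x) = G(f \restriction \hat{x})) \}$

and

    $F := \bigcup B$.

We first show that

    $f, g \in B \to x \in \mathrm{dom}(f) \cap \mathrm{dom}(g) \to f(x) = g(x)$.

So let $f, g \in B$. We prove the claim by induction on $x$, i.e., by an application
of the Induction Theorem to

    $\{ x \mid x \in \mathrm{dom}(f) \cap \mathrm{dom}(g) \to f(x) = g(x) \}$.

So let $x \in \mathrm{dom}(f) \cap \mathrm{dom}(g)$. Then

    $\hat{x} \subseteq \mathrm{dom}(f) \cap \mathrm{dom}(g)$,    for $\mathrm{dom}(f)$, $\mathrm{dom}(g)$ are $R$-transitive
    $f \restriction \hat{x} = g \restriction \hat{x}$            by IH
    $G(f \restriction \hat{x}) = G(g \restriction \hat{x})$
    $f(x) = g(x)$.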


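Therefore $F$ is a function. Now this immediately implies $f \in B \to x \in
\mathrm{dom}(f) \to F(x) = f(x)$; hence we have shown

(40)    $F(x) = G(F \restriction \hat{x})$ for all $x \in \mathrm{dom}(F)$.

We now show

    $\mathrm{dom}(F) = A$.

$\subseteq$ is clear. $\supseteq$. Use the Induction Theorem. Let $\hat{y} \subseteq \mathrm{dom}(F)$. We must show
$y \in \mathrm{dom}(F)$. This is proved indirectly; so assume $y \notin \mathrm{dom}(F)$. Let $b$ be
$R$-transitive such that $\hat{y} \subseteq b \subseteq A$. Define

    $g := F \restriction b \cup \{ (y, G(F \restriction \hat{y})) \}$.

It clearly suffices to show $g \in B$, for because of $y \in \mathrm{dom}(g)$ this implies
$y \in \mathrm{dom}(F)$ and hence the desired contradiction.

$g$ is a function: This is clear, since $y \notin \mathrm{dom}(F)$ by assumption.

$\mathrm{dom}(g)$ is $R$-transitive: We have $\mathrm{dom}(g) = (b \cap \mathrm{dom}(F)) \cup \{y\}$. First
notice that $\mathrm{dom}(F)$ as a union of $R$-transitive sets is $R$-transitive itself.
Moreover, since $b$ is $R$-transitive, also $b \cap \mathrm{dom}(F)$ is $R$-transitive. Now let
$zRx$ and $x \in \mathrm{dom}(g)$. We must show $z \in \mathrm{dom}(g)$. In case $x \in b \cap \mathrm{dom}(F)$
also $z \in b \cap \mathrm{dom}(F)$ (since $b \cap \mathrm{dom}(F)$ is $R$-transitive, as we just observed),
hence $z \in \mathrm{dom}(g)$. In case $x = y$ we have $z \in \hat{y}$, hence $z \in b$ and $z \in \mathrm{dom}(F)$
by the choice of $b$ and $y$, hence again $z \in \mathrm{dom}(g)$.

$\forall x{\in}\mathrm{dom}(g)\, (g(x) = G(g \restriction \hat{x}))$: In case $x \in b \cap \mathrm{dom}(F)$ we have

    $g(x) = F(x)$
          $= G(F \restriction \hat{x})$    by (40)
          $= G(g \restriction \hat{x})$    since $\hat{x} \subseteq b \cap \mathrm{dom}(F)$, for $b \cap \mathrm{dom}(F)$ is $R$-transitive.

In case $x = y$ we have $g(x) = G(F \restriction \hat{x}) = G(g \restriction \hat{x})$, for $\hat{x} = \hat{y} \subseteq b \cap \mathrm{dom}(F)$ by the
choice of $y$.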
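
On a finite domain the Recursion Theorem corresponds to a familiar programming pattern: compute F(x) from the already computed values on the predecessors of x. The following is a minimal sketch under stated assumptions, not the construction used in the proof: the domain is finite, the relation is given by a function returning the predecessors, and the restriction of F to the predecessors is passed to G as a Python dict. The example relation (proper divisibility on 1, ..., 12) and all names are chosen here for illustration.

```python
def recurse(domain, predecessors, G):
    """Return F as a dict with F[x] = G(F restricted to predecessors(x)).

    Assumes the relation is well-founded on the finite domain, so the
    recursion below terminates.
    """
    F = {}

    def value(x):
        if x not in F:
            restriction = {y: value(y) for y in predecessors(x)}
            F[x] = G(restriction)
        return F[x]

    for x in domain:
        value(x)
    return F

domain = range(1, 13)

def proper_divisors(x):
    """Predecessors of x under the relation 'y properly divides x'."""
    return [y for y in domain if y < x and x % y == 0]

def G(f):
    """Least number greater than all values of f (0 if f is empty)."""
    return max((v + 1 for v in f.values()), default=0)

# height[x] is the length of the longest divisor chain below x.
height = recurse(domain, proper_divisors, G)
print(height[12])
```

Uniqueness in the theorem corresponds to the fact that the result does not depend on the order in which the domain is traversed.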
3.3. Natural Numbers. Zermelo defined the natural numbers within
set theory as follows: $0 = \emptyset$, $1 = \{\emptyset\}$, $2 = \{\{\emptyset\}\}$, $3 = \{\{\{\emptyset\}\}\}$ and so
on. A disadvantage of this definition is that it cannot be generalized to the
transfinite. Later, John von Neumann proposed to represent the number $n$
by a certain set consisting of exactly $n$ elements, namely

    $n := \{0, 1, \dots, n-1\}$.

So $0 = \emptyset$ and $n + 1 = \{0, 1, \dots, n\} = \{0, 1, \dots, n-1\} \cup \{n\}$. Generally we
define

    $0 := \emptyset$,    $x + 1 := x \cup \{x\}$.

In particular, $1 := 0 + 1$, $2 := 1 + 1$, $3 := 2 + 1$ and so on.
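
The von Neumann numerals are easy to build explicitly. A minimal sketch, assuming sets are modelled by Python frozensets; successor and numeral are illustrative names. For such small numerals one can check directly that the numeral n has exactly n elements and that membership and proper inclusion between numerals agree with the usual order.

```python
def successor(x):
    """x + 1 := x ∪ {x}."""
    return x | frozenset({x})

def numeral(n):
    """The von Neumann numeral for the Python integer n."""
    s = frozenset()
    for _ in range(n):
        s = successor(s)
    return s

three, five = numeral(3), numeral(5)
print(len(five))        # number of elements of 5
print(three in five)    # membership plays the role of <
print(three < five)     # proper inclusion, written < for frozensets
```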

In order to know that the class of all natural numbers constructed in
this way is a set, we need another axiom:

AXIOM (Infinity).

    $\exists x.\, \emptyset \in x \land \forall y.\, y \in x \to y \cup \{y\} \in x$.

The Infinity Axiom holds in the cumulative type structure. To see this,
observe that $0 = \emptyset$, $1 := \emptyset \cup \{\emptyset\}$, $2 := 1 \cup \{1\}$ and so on are formed at levels
$S_0, S_1, S_2, \dots$, and we can conceive a situation where all these levels are
completed. By the Shoenfield principle there must be a level, call it $S_\omega$,
which is past all these levels. At $S_\omega$ we can form $\omega$.

We call a class $A$ inductive if

    $\emptyset \in A \land \forall y.\, y \in A \to y \cup \{y\} \in A$.

So the infinity axiom says that there is an inductive set. Define

    $\omega := \bigcap \{ x \mid x$ is inductive $\}$.

Clearly $\omega$ is a set, with the properties $0 \in \omega$ and $y \in \omega \to y + 1 \in \omega$. $\omega$ is
called the set of natural numbers.

Let $n$, $m$ denote natural numbers. $\forall n\, A(n)$ is short for $\forall x.\, x \in \omega \to A(x)$,
similarly $\exists n\, A(n)$ for $\exists x.\, x \in \omega \land A(x)$ and $\{ n \mid A(n) \}$ for $\{ x \mid x \in \omega \land A(x) \}$.
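THEOREM 3.4 (Induction on $\omega$).

(a) $x \subseteq \omega \to 0 \in x \to (\forall n.\, n \in x \to n + 1 \in x) \to x = \omega$.
(b) For every formula $A(x)$,

        $A(0) \to (\forall n.\, A(n) \to A(n + 1)) \to \forall n\, A(n)$.

PROOF. (a). $x$ is inductive, hence $\omega \subseteq x$. (b). Let $A := \{ n \mid A(n) \}$.
Then $A \subseteq \omega$ (so $A$ is a set), and by assumption

    $0 \in A$,
    $n \in A \to n + 1 \in A$.

By (a), $A = \omega$.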


We now show that for natural numbers the relation $\in$ has all the
properties of $<$, and the relation $\subseteq$ all the properties of $\leq$.

A class $A$ is called transitive if it is $E$-transitive w.r.t. the special relation
$E := \{ (x, y) \mid x \in y \}$ on $V$, i.e., if $\forall x.\, x \in A \to x \subseteq A$. Therefore $A$ is
transitive iff

    $y \in x \in A \to y \in A$.

LEMMA 3.5. (a) $n$ is transitive.
(b) $\omega$ is transitive.

PROOF. (a). Induction on $n$. $0$ is transitive. $n \to n + 1$. By IH, $n$ is
transitive. We must show that $n + 1$ is transitive. We argue as follows:

    $y \in x \in n + 1$
    $y \in x \in n \cup \{n\}$
    $y \in x \in n \lor y \in x = n$
    $y \in n \lor y \in n$
    $y \in n \cup \{n\} = n + 1$.

(b). We show $\forall x.\, x \in n \to x \in \omega$, by induction on $n$. $0$: Clear. $n \to n + 1$.
By IH we have $\forall x.\, x \in n \to x \in \omega$. So assume $x \in n + 1$. Then $x \in n \lor x = n$,
hence $x \in \omega$.

LEMMA 3.6. $n \notin n$.

PROOF. Induction on $n$. $0$. Clear. $n \to n + 1$: By IH we have $n \notin n$. Assume

    $n + 1 \in n + 1$
    $n + 1 \in n \lor n + 1 = n$
    $n \in n + 1 \in n \lor n \in n + 1 = n$
    $n \in n$    for $n$ is transitive by Lemma 3.5.

This is a contradiction to the IH.


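LEMMA 3.7. (a) $n \subseteq m + 1 \leftrightarrow n \subseteq m \lor n = m + 1$.
(b) $n \subseteq m \leftrightarrow n \in m \lor n = m$.
(c) $n \subseteq m \lor m \subseteq n$.
(d) $n \in m \lor n = m \lor m \in n$.

PROOF. (a). $\leftarrow$ follows from $m \subseteq m + 1$. $\to$. Assume $n \subseteq m + 1$. Case
$m \in n$. We show $n = m + 1$. $\subseteq$ holds by assumption. $\supseteq$.

    $p \in m + 1$
    $p \in m \lor p = m$
    $p \in n$.

Case $m \notin n$. We show $n \subseteq m$.

    $p \in n$
    $p \in m + 1$
    $p \in m \lor p = m$,

but $p = m$ is impossible because of $m \notin n$.

(b). $\leftarrow$ follows from transitivity of $m$. $\to$. Induction on $m$. $0$. Clear.
$m \to m + 1$.

    $n \subseteq m + 1$
    $n \subseteq m \lor n = m + 1$                by (a)
    $n \in m \lor n = m \lor n = m + 1$           by IH
    $n \in m + 1 \lor n = m + 1$.

(c). Induction on $n$. $0$. Clear. $n \to n + 1$: Case $m \subseteq n$. Clear. Case
$n \subseteq m$. Then

    $n \in m \lor n = m$                          by (b)
    $n, \{n\} \subseteq m \lor m \subseteq n + 1$
    $n + 1 \subseteq m \lor m \subseteq n + 1$.

(d). Follows from (c) and (b).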


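THEOREM 3.8 (Peano-Axioms). (a) $n + 1 \neq \emptyset$.
(b) $n + 1 = m + 1 \to n = m$.
(c) $x \subseteq \omega \to 0 \in x \to (\forall n.\, n \in x \to n + 1 \in x) \to x = \omega$.

PROOF. (a). Clear. (c). This is Theorem 3.4(a). (b).

    $n + 1 = m + 1$
    $n \in m + 1 \land m \in n + 1$
    $(n \in m \land m \in n) \lor n = m$
    $n \in n \lor n = m$
    $n = m$.

This concludes the proof.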


We now treat different forms of induction.

THEOREM 3.9 (Course-of-values induction on $\omega$).

(a) $x \subseteq \omega \to [\forall n.\, (\forall m.\, m \in n \to m \in x) \to n \in x] \to x = \omega$.
(b) $[\forall n.\, (\forall m.\, m \in n \to A(m)) \to A(n)] \to \forall n\, A(n)$.

PROOF. (b). Assume $\forall n.\, (\forall m.\, m \in n \to A(m)) \to A(n)$; we shall say
in this case that $A(n)$ is progressive. We show $\forall m.\, m \in n \to A(m)$, by
induction on $n$. $0$. Clear. $n \to n + 1$. By IH $\forall m.\, m \in n \to A(m)$. So let
$m \in n + 1$. Then $m \in n \lor m = n$. In case $m \in n$ we obtain $A(m)$ by IH,
and in case $m = n$ we can infer $A(n)$ from the progressiveness of $A$, using
the IH.

(a). From (b), with $A(y) := y \in x$.


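THEOREM 3.10 (Principle of least element for $\omega$).
(a) $\emptyset \neq x \subseteq \omega \to \exists n.\, n \in x \land n \cap x = \emptyset$.
(b) $\exists n\, A(n) \to \exists n.\, A(n) \land \neg\exists m.\, m \in n \land A(m)$.

PROOF. (b). By Theorem 3.9(b)

    $[\forall n.\, (\forall m.\, m \in n \to \neg A(m)) \to \neg A(n)] \to \forall n\, \neg A(n)$.

Contraposition gives

    $\exists n\, A(n) \to \exists n.\, A(n) \land \forall m.\, m \in n \to \neg A(m)$
    $\exists n\, A(n) \to \exists n.\, A(n) \land \neg\exists m.\, m \in n \land A(m)$.

(a). From (b), using $A(y) := y \in x$.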


We now consider recursion on natural numbers, which can be treated as
a special case of the Recursion Theorem 3.3. To this end, we identify $\in$ with
the relation $E = \{ (x, y) \mid x \in y \}$ and prove the following lemma:

LEMMA 3.11. $\in \cap\, (\omega \times \omega)$ is a transitively well-founded relation on $\omega$.

PROOF. We show both conditions, from the definition of transitively
well-founded relations. (i). Let $\emptyset \neq a \subseteq \omega$. We must show $\exists n.\, n \in a \land n \cap a =
\emptyset$. But this is the above principle of the least element. (ii). Clear, since $n$ is
transitive.

THEOREM 3.12 (Course-of-values recursion on $\omega$). Let $G\colon V \to V$. Then
there is exactly one function $f\colon \omega \to V$ such that

    $\forall n\, (f(n) = G(f \restriction n))$.

PROOF. By the Recursion Theorem 3.3 there is a unique $F\colon \omega \to V$
such that $\forall n\, (F(n) = G(F \restriction n))$. By Replacement, $\mathrm{rng}(F) = F[\omega]$ is a set.
By Lemma 2.1 and Separation, also $F \subseteq \omega \times F[\omega]$ is a set.


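COROLLARY 3.13. Let $G\colon V \to V$ and $a$ be a set. Then there is exactly
one function $f\colon \omega \to V$ such that

    $f(0) = a$,
    $\forall n\, (f(n + 1) = G(f(n)))$.

PROOF. First observe that $\bigcup(n + 1) = n$, because of

    $x \in \bigcup(n + 1) \leftrightarrow \exists y.\, x \in y \in n + 1$
                          $\leftrightarrow \exists m.\, x \in m \in n + 1$
                          $\leftrightarrow \exists m.\, x \in m \subseteq n$
                          $\leftrightarrow x \in n$.

For the given $G$ we will construct $G'$ such that $G'(f \restriction n + 1) = G(f(n))$. We
define a function $G'\colon V \to V$ satisfying

    $G'(x) = G(x(\bigcup \mathrm{dom}(x)))$, if $x \neq \emptyset$;
    $G'(x) = a$,                            if $x = \emptyset$,

by

    $G' = \{ (x, y) \mid (x \neq \emptyset \to y = G(x(\bigcup \mathrm{dom}(x)))) \land (x = \emptyset \to y = a) \}$.

Then there is a unique function $f\colon \omega \to V$ such that

    $f(n + 1) = G'(f \restriction n + 1)$
              $= G((f \restriction n + 1)(\bigcup(n + 1)))$
              $= G(f(n))$,    since $\bigcup(n + 1) = n$,
    $f(0) = G'(f \restriction 0)$
          $= G'(\emptyset)$
          $= a$.

This concludes the proof.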


We now define

    $s_m(0) = m$,    $s_m(n + 1) = s_m(n) + 1$.

By Corollary 3.13 for every $m$ there is such a function, and it is uniquely
determined. We define

    $m + n := s_m(n)$.

Because of $s_m(1) = s_m(0 + 1) = s_m(0) + 1 = m + 1$, for $n = 1$ this definition
is compatible with the previous terminology. Moreover, we have $m + 0 = m$
and $m + (n + 1) = (m + n) + 1$.

LEMMA 3.14. (a) $m + n \in \omega$.
(b) $(m + n) + p = m + (n + p)$.
(c) $m + n = n + m$.

PROOF. (a). Induction on $n$. $0$. Clear. $n \to n + 1$. $m + (n + 1) =
(m + n) + 1$, and by IH $m + n \in \omega$.

(b). Induction on $p$. $0$. Clear. $p \to p + 1$.

    $(m + n) + (p + 1) = [(m + n) + p] + 1$
                       $= [m + (n + p)] + 1$    by IH
                       $= m + [(n + p) + 1]$
                       $= m + [n + (p + 1)]$.

(c). We first prove two auxiliary propositions.
(i) $0 + n = n$. The proof is by induction on $n$. $0$. Clear. $n \to n + 1$.
$0 + (n + 1) = (0 + n) + 1 = n + 1$.
(ii) $(m + 1) + n = (m + n) + 1$. Again the proof is by induction on $n$. $0$.
Clear. $n \to n + 1$.

    $(m + 1) + (n + 1) = [(m + 1) + n] + 1$
                       $= [(m + n) + 1] + 1$    by IH
                       $= [m + (n + 1)] + 1$.

Now the claim $m + n = n + m$ can be proved by induction on $m$. $0$. By
(i). Step $m \to m + 1$.

    $(m + 1) + n = (m + n) + 1$    by (ii)
                 $= (n + m) + 1$    by IH
                 $= n + (m + 1)$.

This concludes the proof.
We define

    $p_m(0) = 0$,    $p_m(n + 1) = p_m(n) + m$.

By Corollary 3.13 for every $m$ there is a unique such function. Here we need

    $G\colon V \to V$,
    $G(x) = x + m$,       if $x \in \omega$;
    $G(x) = \emptyset$,   otherwise.

We finally define $m \cdot n := p_m(n)$. Observe that this implies $m \cdot 0 = 0$,
$m \cdot (n + 1) = m \cdot n + m$.

LEMMA 3.15. (a) $m \cdot n \in \omega$.
(b) $m \cdot (n + p) = m \cdot n + m \cdot p$.
(c) $(n + p) \cdot m = n \cdot m + p \cdot m$.
(d) $(m \cdot n) \cdot p = m \cdot (n \cdot p)$.
(e) $0 \cdot n = 0$, $1 \cdot n = n$, $m \cdot n = n \cdot m$.


PROOF. Exercise. 
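
The recursion equations for addition and multiplication can be run directly on von Neumann numerals. A minimal sketch, assuming numerals are modelled by Python frozensets; the helper names are illustrative. The predecessor of a successor numeral is obtained as its big union, as observed in the proof of Corollary 3.13. This only spot-checks small instances and is of course no substitute for the inductive proofs.

```python
def successor(x):
    """x + 1 := x ∪ {x}."""
    return x | frozenset({x})

def numeral(n):
    """The von Neumann numeral for the Python integer n."""
    s = frozenset()
    for _ in range(n):
        s = successor(s)
    return s

def big_union(x):
    """Union of all elements of x; for a numeral n + 1 this is n."""
    return frozenset().union(*x)

def add(m, n):
    """m + 0 = m,  m + (n + 1) = (m + n) + 1."""
    return m if not n else successor(add(m, big_union(n)))

def mul(m, n):
    """m * 0 = 0,  m * (n + 1) = m * n + m."""
    return frozenset() if not n else add(mul(m, big_union(n)), m)

print(add(numeral(2), numeral(3)) == numeral(5))
print(mul(numeral(2), numeral(3)) == numeral(6))
```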


REMARK. $n^m$ and $m - n$ can be treated similarly; later (when we deal
with ordinal arithmetic) this will be done more generally. We could now
introduce the integers, rationals, reals and complex numbers in the
well-known way, and prove their elementary properties.


3.4. Transitive Closure. We define the $R$-transitive closure of a set $a$,
w.r.t. a relation $R$ with the property that the $R$-predecessors of an arbitrary
element of its domain form a set.

THEOREM 3.16. Let $R$ be a relation on $A$ such that $\hat{x}^R$ ($:= \{ y \mid yRx \}$)
is a set, for every $x \in A$. Then for every subset $a \subseteq A$ there is a uniquely
determined set $b$ such that

(a) $a \subseteq b \subseteq A$;
(b) $b$ is $R$-transitive;
(c) $\forall c.\, a \subseteq c \subseteq A \to c$ $R$-transitive $\to b \subseteq c$.

$b$ is called the $R$-transitive closure of $a$.


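PROOF. Uniqueness. Clear by (c). Existence. We shall define $f\colon \omega \to V$
by recursion on $\omega$, such that

    $f(0) = a$,
    $f(n + 1) = \{ y \mid \exists x{\in}f(n)\, (yRx) \}$.

In order to apply the Recursion Theorem for $\omega$, we must define $f(n + 1)$ in
the form $G(f(n))$. To this end choose $G\colon V \to V$, $z \mapsto \bigcup \mathrm{rng}(H \restriction z)$ such that
$H\colon V \to V$, $x \mapsto \hat{x}$; by assumption $H$ is a function. Then

    $y \in G(f(n)) \leftrightarrow y \in \bigcup \mathrm{rng}(H \restriction f(n))$
                   $\leftrightarrow \exists z.\, z \in \mathrm{rng}(H \restriction f(n)) \land y \in z$
                   $\leftrightarrow \exists z, x.\, x \in f(n) \land z = \hat{x} \land y \in z$
                   $\leftrightarrow \exists x.\, x \in f(n) \land yRx$.

By induction on $n$ one can see easily that $f(n)$ is a set. For $0$ this is clear,
and in the step $n \to n + 1$ it follows, using $f(n + 1) = \bigcup \{ \hat{x} \mid x \in f(n) \}$,
from the IH, Replacement and the Union Axiom. We now define $b :=
\bigcup \mathrm{rng}(f) = \bigcup \{ f(n) \mid n \in \omega \}$. Then

(a). $a = f(0) \subseteq b \subseteq A$.

(b).
    $yRx \in b$
    $yRx \in f(n)$
    $y \in f(n + 1)$
    $y \in b$.

(c). Let $a \subseteq c \subseteq A$ and $c$ be $R$-transitive. We show $f(n) \subseteq c$ by
induction on $n$. $0$. $a \subseteq c$. $n \to n + 1$.

    $y \in f(n + 1)$
    $yRx \in f(n)$
    $yRx \in c$
    $y \in c$.

This concludes the proof.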
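
For finite data the construction in this proof becomes a simple iteration: start with a, repeatedly add all predecessors of elements found so far, and stop when nothing new appears. A minimal sketch, assuming the relation is given by a hypothetical function predecessors returning the R-predecessors of an element; in the usage example the relation is membership on frozensets.

```python
def transitive_closure(a, predecessors):
    """Least R-transitive set containing every element of a.

    closure collects f(0) ∪ f(1) ∪ ...; frontier holds the elements
    added in the latest step that have not been seen before.
    """
    closure = set(a)
    frontier = set(a)
    while frontier:
        frontier = {y for x in frontier for y in predecessors(x)} - closure
        closure |= frontier
    return frozenset(closure)

# Membership relation: the predecessors of a set are its elements.
e0 = frozenset()
e1 = frozenset({e0})
x = frozenset({e1})
a = frozenset({x})
print(len(transitive_closure(a, lambda s: s)))
```

For the set a = {{{∅}}} the closure consists of {{∅}}, {∅} and ∅; note that it contains the elements of a, not a itself.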


In the special case of the element relation $\in$ on $V$, the condition $\forall x\, (\hat{x} =
\{ y \mid y \in x \}$ is a set$)$ clearly holds. Hence for every set $a$ there is a uniquely
determined $\in$-transitive closure of $a$. It is called the transitive closure of $a$.

By means of the notion of the R-transitive closure we can now show that 
the transitively well-founded relations on A coincide with the well-founded 
relations on A. 

Let $R$ be a relation on $A$. $R$ is a well-founded relation on $A$ if

(a) Every nonempty subset of $A$ has an $R$-minimal element, i.e.,

        $\forall a.\, a \subseteq A \to a \neq \emptyset \to \exists x{\in}a.\, \hat{x} \cap a = \emptyset$;

(b) for every $x \in A$, $\hat{x}$ is a set.


THEOREM 3.17. The transitively well-founded relations on A are the 
same as the well-founded relations on A. 


PROOF. Every transitively well-founded relation on $A$ is well-founded by
Lemma 3.1(b). Conversely, every well-founded relation on $A$ is transitively
well-founded, since for every $x \in A$, the $R$-transitive closure of $\hat{x}$ is an
$R$-transitive $b \subseteq A$ such that $\hat{x} \subseteq b$.


Therefore, the Induction Theorem 3.2 and the Recursion Theorem 3.3 
also hold for well-founded relations. Moreover, by Lemma 3.1(a), every 
nonempty subclass of a well-founded relation R has an R-minimal element. 

Later we will require the so-called Regularity Axiom, which says that
the relation $\in$ on $V$ is well-founded, i.e.,

    $\forall a.\, a \neq \emptyset \to \exists x{\in}a.\, x \cap a = \emptyset$.


This will provide us with an important example of a well-founded relation. 

We now consider extensional well-founded relations. From the Regularity
Axiom it will follow that the $\in$-relation on an arbitrary class $A$ is a
well-founded extensional relation. Here we show, even without the Regularity
Axiom, the converse, namely that every well-founded extensional relation
is isomorphic to the $\in$-relation on a transitive class. This is Mostowski’s
Isomorphy Theorem. Then we consider linear well-founded orderings,
well-orderings for short. They are always extensional, and hence isomorphic to
the $\in$-relation on certain classes, which will be called ordinal classes.
Ordinals will then be defined as ordinal sets.

A relation $R$ on $A$ is extensional if for all $x, y \in A$

    $(\forall z{\in}A.\, zRx \leftrightarrow zRy) \to x = y$.

For example, for a transitive class $A$ the relation $\in \cap\, (A \times A)$ is extensional
on $A$. This can be seen as follows. Let $x, y \in A$. For $R := {\in} \cap (A \times A)$ we
have $zRx \leftrightarrow z \in x$, since $A$ is transitive. We obtain

    $\forall z{\in}A.\, zRx \leftrightarrow zRy$
    $\forall z.\, z \in x \leftrightarrow z \in y$
    $x = y$.

From the Regularity Axiom it will follow that all these relations are
well-founded. But even without the Regularity Axiom these relations have
a distinguished meaning; cf. Corollary 3.19.

THEOREM 3.18 (Isomorphy Theorem of Mostowski). Let $R$ be a
well-founded extensional relation on $A$. Then there is a unique isomorphism $F$
of $A$ onto a transitive class $B$, i.e.

    $\exists! F.\, F\colon A \leftrightarrow \mathrm{rng}(F) \land \mathrm{rng}(F)$ transitive $\land\ \forall x, y{\in}A.\, yRx \leftrightarrow F(y) \in F(x)$.


PROOF. Existence. We define by the Recursion Theorem 
F:A-V, 
F(x) = rng(F la) (={F(y) | yRe }). 


F injective: We show Vz, yEA.F (x) = Fly) — x = y by R-induction 
on x. So let x,y € A be given such that F(x) = F(y). By IH 


VzE A ZRe > WueEA F(z) = Flu) — 2 =u. 
It suffices to show zRa — zRy, for all ze A. >. 


F(z) € F(x) = Fly) = {F(u) | uRy } 
F(z) = Flu) for some uRy 
z=U by IH, since zRa 


F(z) € Fly) = F(x) = {F(u) | uRx} 
F(z) = F(u) for some uRx 
zZ=U by IH, since uRx 


rg(F) is transitive: Assume u € v € rng(F). Then v = F(x) for some 
x € A, hence u = F(y) for some yRa. 

yR« — Fly) € F(a): —. Assume yRa. Then F(y) € F(x) by definition 
of F. <—. 


Fly) € F(x) = { F(z) | <Ra} 
Fa) =F) for some zRx 
ye since F is injective 
YR«2. 
Uniqueness. Let $F_i$ ($i = 1, 2$) be two isomorphisms as described in
the theorem. We show $\forall x \in A\ (F_1(x) = F_2(x))$, by $R$-induction on $x$. By
symmetry it suffices to prove $u \in F_1(x) \to u \in F_2(x)$.

    $u \in F_1(x)$
    $u = F_1(y)$   for some $y \in A$, since $\mathrm{rng}(F_1)$ is transitive
    $yRx$   by the isomorphy condition for $F_1$
    $u = F_2(y)$   by IH
    $F_2(y) \in F_2(x)$   by the isomorphy condition for $F_2$
    $u \in F_2(x).$


This concludes the proof. 
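
For a finite relation the recursion $F(x) = \{\, F(y) \mid yRx \,\}$ from the existence proof can be run directly. The following is only a sketch under stated assumptions: the relation is finite and well-founded, sets are modelled as Python `frozenset` values, and the function name `mostowski_collapse` and the example relation are hypothetical, not part of the text.

```python
from functools import lru_cache

def mostowski_collapse(A, R):
    """Return F as a dict, with F(x) = {F(y) | y R x}.

    A is a finite set; R is a set of pairs (y, x) meaning "y R x".
    R is assumed to be well-founded, otherwise the recursion would not terminate.
    """
    predecessors = {x: [y for (y, z) in R if z == x] for x in A}

    @lru_cache(maxsize=None)
    def F(x):
        # The value at x is built from the values at all R-predecessors of x.
        return frozenset(F(y) for y in predecessors[x])

    return {x: F(x) for x in A}

# Hypothetical example: a linear order a R b R c on three elements.
A = {"a", "b", "c"}
R = {("a", "b"), ("a", "c"), ("b", "c")}
F = mostowski_collapse(A, R)

zero = frozenset()
one = frozenset({zero})
two = frozenset({zero, one})
```

In this example the three elements are sent to the sets 0, 1 and 2, and `y R x` holds exactly when `F[y] in F[x]`.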


A relation $R$ on $A$ is a linear ordering if for all $x, y, z \in A$

    $\neg xRx$   irreflexivity,
    $xRy \to yRz \to xRz$   transitivity,
    $xRy \lor x = y \lor yRx$   trichotomy (or compatibility).

$R$ is a well-ordering if $R$ is a well-founded linear ordering.

REMARK. Every well-ordering $R$ on $A$ is extensional. To see this, assume

    $\forall z \in A.\ zRx \leftrightarrow zRy.$

Then $x = y$ by trichotomy, since from $xRy$ we obtain by assumption $xRx$,
contradicting irreflexivity, and similarly $yRx$ entails a contradiction.


COROLLARY 3.19. For every well-ordering R on A there is a unique 
isomorphism F of A onto a transitive class B. 


3.5. Ordinal classes and ordinals. We now study more closely the
transitive classes that appear as images of well-orderings.

$A$ is an ordinal class if $A$ is transitive and ${\in} \cap (A \times A)$ is a well-ordering
on $A$. Ordinal classes that happen to be sets are called ordinals. Define

    $\mathrm{On} := \{\, x \mid x \text{ is an ordinal} \,\}.$

First we give a convenient characterization of ordinal classes. $A$ is called
connex if for all $x, y \in A$

    $x \in y \lor x = y \lor y \in x.$

For instance, $\omega$ is connex by Lemma 3.7(d). Also every $n$ is connex, since
by Lemma 3.5(b), $\omega$ is transitive.

A class $A$ is well-founded if ${\in} \cap (A \times A)$ is a well-founded relation on $A$,
i.e., if

    $\forall a.\ a \subseteq A \to a \neq \emptyset \to \exists x \in a.\ x \cap a = \emptyset.$

We now show that in well-founded classes there can be no finite $\in$-cycles.


LEMMA 3.20. Let $A$ be well-founded. Then for arbitrary $x_1, \dots, x_n \in A$
we can never have

    $x_1 \in x_2 \in \dots \in x_n \in x_1.$

PROOF. Assume $x_1 \in x_2 \in \dots \in x_n \in x_1$. Consider $\{x_1, \dots, x_n\}$. Since
$A$ is well-founded we may assume $x_1 \cap \{x_1, \dots, x_n\} = \emptyset$. But this contradicts
$x_n \in x_1$.


COROLLARY 3.21. $A$ is an ordinal class iff $A$ is transitive, connex and
well-founded.

PROOF. $\to$ is clear; $A$ is connex because of trichotomy. $\leftarrow$: We must
show, for all $x, y, z \in A$,

    $x \notin x,$
    $x \in y \to y \in z \to x \in z.$

Since $A$ is connex, both propositions follow from Lemma 3.20.
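
The characterization in Corollary 3.21 can be checked mechanically on hereditarily finite sets. The sketch below assumes sets are modelled as nested Python `frozenset` values; for these, well-foundedness holds automatically (a `frozenset` cannot contain itself), so only transitivity and connexity are tested. All function names are illustrative.

```python
def is_transitive(a):
    # Every element of an element of a is again an element of a.
    return all(u in a for v in a for u in v)

def is_connex(a):
    # Any two elements are comparable by membership or equal.
    return all(x in y or x == y or y in x for x in a for y in a)

def is_ordinal(a):
    # Well-foundedness is automatic for nested frozensets (an assumption of this model).
    return is_transitive(a) and is_connex(a)

def successor(x):
    # x + 1 is defined as x ∪ {x}.
    return x | frozenset({x})

zero = frozenset()
one = successor(zero)
two = successor(one)
three = successor(two)
not_ordinal = frozenset({one})  # {1} is not transitive: 0 ∈ 1 but 0 ∉ {1}
```

Here `three` passes the check and so do all of its elements, while `not_ordinal` fails because it is not transitive.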


Here are some examples of ordinals. $\omega$ is transitive by Lemma 3.5(b),
connex as noted above and well-founded by the principle of least element
(Theorem 3.10). So, $\omega$ is an ordinal class. Since $\omega$ by the Infinity Axiom is
a set, $\omega$ is even an ordinal. Also, $n$ is transitive by Lemma 3.5(a), connex
(see above) and well-founded; the latter follows with transitivity of $\omega$ by the
principle of least element (Theorem 3.10).

Let us write $\mathrm{Ord}(A)$ for "$A$ is an ordinal class". We now show that
ordinal classes have properties similar to those of natural numbers: the
relation $\in$ has the properties of $<$, and the relation $\subseteq$ has the properties of
$\le$.

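LEMMA 3.22. (a) $\mathrm{Ord}(A) \to \mathrm{Ord}(B) \to \mathrm{Ord}(A \cap B)$.
(b) $\mathrm{Ord}(A) \to x \in A \to \mathrm{Ord}(x)$.
(c) $\mathrm{Ord}(A) \to \mathrm{Ord}(B) \to (A \subseteq B \leftrightarrow A \in B \lor A = B)$.
(d) $\mathrm{Ord}(A) \to \mathrm{Ord}(B) \to (A \in B \lor A = B \lor B \in A)$.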


PROOF. (a). $A \cap B$ transitive:

    $x \in y \in A \cap B$
    $x \in y \in A$ and $x \in y \in B$
    $x \in A$ and $x \in B$
    $x \in A \cap B.$

$A \cap B$ connex, well-founded: Clear.

(b). $x$ transitive:

    $u \in v \in x \in A$
    $u \in v \in A$
    $u \in A$
    $u \in x \lor u = x \lor x \in u.$

From $u = x$ it follows that $u \in v \in u$ contradicting Lemma 3.20, and from
$x \in u$ it follows that $u \in v \in x \in u$, again contradicting Lemma 3.20.

$x$ connex, well-founded. Clear, for $x \subseteq A$.

(c). $\leftarrow$. Clear, for $B$ is transitive. $\to$. Let $A \subseteq B$. Wlog $A \subsetneq B$. Choose
$x \in B \setminus A$ such that $x \cap (B \setminus A) = \emptyset$ (this is possible, since $B$ is well-founded);
it suffices to show that $x = A$.

$x \subseteq A$. Assume $y \in x$, hence $y \in x \in B$. Then $y \in A$, for $x \cap (B \setminus A) = \emptyset$.

$A \subseteq x$. Assume $y \in A$. Then also $y \in B$. It follows that $x \in y \lor x = y \lor y \in x$.
But the first two cases are impossible, for in both of them we
obtain $x \in A$.

(d). Assume $\mathrm{Ord}(A)$ and $\mathrm{Ord}(B)$. Then by (a), $\mathrm{Ord}(A \cap B)$. Using (c)
we obtain

    $[(A \cap B \in A) \lor (A \cap B = A)] \land [(A \cap B \in B) \lor (A \cap B = B)].$

Distributing yields

    $(A \cap B \in A \cap B) \lor (A \in B) \lor (B \in A) \lor (A = B).$

But the first case $A \cap B \in A \cap B$ is impossible by Lemma 3.20.


LEMMA 3.23. (a) $\mathrm{Ord}(\mathrm{On})$.
(b) $\mathrm{On}$ is not a set.
(c) $\mathrm{On}$ is the only proper ordinal class.

PROOF. (a). $\mathrm{On}$ is transitive by Lemma 3.22(b) and it is connex by
Lemma 3.22(d). $\mathrm{On}$ is well-founded: Let $a \subseteq \mathrm{On}$, $a \neq \emptyset$. Choose $x \in a$.
Wlog $x \cap a \neq \emptyset$. Since $x$ is well-founded, there is a $y \in x \cap a$ such that
$y \cap x \cap a = \emptyset$. It follows that $y \in a$ and $y \cap a = \emptyset$; the latter holds since
$y \subseteq x$ because of $y \in x$, $x$ transitive.

(b). Assume $\mathrm{On}$ is a set. Then $\mathrm{On} \in \mathrm{On}$, contradicting Lemma 3.20.

(c). Let $\mathrm{Ord}(A)$, $A$ not a set. By Lemma 3.22(d)

    $A \in \mathrm{On} \lor A = \mathrm{On} \lor \mathrm{On} \in A.$

The first and the last case are excluded, for then $A$ (or $\mathrm{On}$, resp.) would be
a set.


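LEMMA 3.24. (a) $\mathrm{On}$ is inductive,
(b) $n, \omega \in \mathrm{On}$.

PROOF. (a). $0 \in \mathrm{On}$ is clear. So let $x \in \mathrm{On}$. We must show $x + 1 \in \mathrm{On}$,
that is $x \cup \{x\} \in \mathrm{On}$.

$x \cup \{x\}$ transitive: Assume $u \in v \in x \cup \{x\}$, so $u \in v \in x$ or $u \in v = x$.
In both cases it follows that $u \in x$.

$x \cup \{x\}$ is connex: Assume $u, v \in x \cup \{x\}$. Then

    $(u, v \in x) \lor (u \in x \land v = x) \lor (u = x \land v \in x) \lor (u = v = x)$
    $u \in v \lor u = v \lor v \in u.$

$x \cup \{x\}$ is well-founded: Let $a \subseteq x \cup \{x\}$, $a \neq \emptyset$. We must show
$\exists y \in a\ (y \cap a = \emptyset)$. Case $a \cap x \neq \emptyset$. Then the claim follows from the
well-foundedness of $x$. Case $a \cap x = \emptyset$. Then $a = \{x\}$, and we have $x \cap \{x\} = \emptyset$.

(b). This has been proved above, after Corollary 3.21.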


LEMMA 3.25. $x, y \in \mathrm{On} \to x + 1 = y + 1 \to x = y$.

PROOF. The proof is similar to the proof of the second Peano-Axiom in
Theorem 3.8(b).

    $x + 1 = y + 1$
    $x \in y + 1 \land y \in x + 1$
    $(x \in y \land y \in x) \lor x = y.$

Since the first case is impossible by Lemma 3.22, we have $x = y$.


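LEMMA 3.26. $A \subseteq \mathrm{On} \to \bigcup A \in \mathrm{On} \lor \bigcup A = \mathrm{On}$.

PROOF. It suffices to show $\mathrm{Ord}(\bigcup A)$. $\bigcup A$ is transitive: Let $x \in y \in \bigcup A$,
so $x \in y \in z \in A$ for some $z$. Then we have $x \in z \in A$, since $A \subseteq \mathrm{On}$.
Hence $x \in \bigcup A$.

$\bigcup A$ is connex and well-founded: It suffices to prove $\bigcup A \subseteq \mathrm{On}$. So
let $x \in \bigcup A$, hence $x \in y \in A$ for some $y$. Then $x \in y$ and $y \in \mathrm{On}$, so
$x \in \mathrm{On}$.

REMARK. If $A \subseteq \mathrm{On}$, then $\bigcup A$ is the least upper bound of $A$ w.r.t. the
well-ordering ${\in} \cap (\mathrm{On} \times \mathrm{On})$ of $\mathrm{On}$, for by definition of $\bigcup A$ we have

    $x \in A \to x \subseteq \bigcup A,$
    $(\forall x \in A.\ x \subseteq y) \to \bigcup A \subseteq y.$

We therefore also write $\sup A$ for $\bigcup A$.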
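Here are some examples of ordinals:

    $0$
    $1 = 0 + 1$
    $2 = 1 + 1$
    $\vdots$
    $\omega$   set by the Infinity Axiom
    $\omega + 1$
    $\omega + 2$
    $\vdots$
    $\omega \cdot 2 := \bigcup \{\, \omega + n \mid n \in \omega \,\}$   by recursion on $\omega$
    $\omega \cdot 2 + 1$
    $\omega \cdot 2 + 2$
    $\vdots$
    $\omega \cdot 3 := \bigcup \{\, \omega \cdot 2 + n \mid n \in \omega \,\}$
    $\vdots$
    $\omega^2 := \bigcup \{\, \omega \cdot n \mid n \in \omega \,\}$
    $\omega^2 + 1$
    $\omega^2 + 2$
    $\vdots$
    $\omega^2 + \omega \cdot 2$
    $\vdots$
    $\omega^\omega$
    $\omega^\omega + 1$

and so on.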


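$\alpha, \beta, \gamma$ will denote ordinals.

$\alpha$ is a successor number if $\exists \beta\, (\alpha = \beta + 1)$. $\alpha$ is a limit if $\alpha$ is neither $0$
nor a successor number. We write

    $\mathrm{Lim}(\alpha)$ for $\alpha \neq 0 \land \neg \exists \beta\, (\alpha = \beta + 1)$.

Clearly for arbitrary $\alpha$ either $\alpha = 0$ or $\alpha$ is a successor number or $\alpha$ is a limit.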


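LEMMA 3.27. (a) $\mathrm{Lim}(\alpha) \leftrightarrow \alpha \neq 0 \land \forall \beta.\ \beta \in \alpha \to \beta + 1 \in \alpha$.
(b) $\mathrm{Lim}(\omega)$.
(c) $\mathrm{Lim}(\alpha) \to \omega \subseteq \alpha$.

PROOF. (a). $\to$: Let $\beta \in \alpha$. Then $\beta + 1 \in \alpha \lor \beta + 1 = \alpha \lor \alpha \in \beta + 1$.
The second case $\beta + 1 = \alpha$ is excluded by assumption. In the third case it
follows that $\alpha \in \beta \lor \alpha = \beta$; but because of $\beta \in \alpha$ both are impossible by
Lemma 3.20. $\leftarrow$. Let $\alpha \neq 0$ and assume $\forall \beta.\ \beta \in \alpha \to \beta + 1 \in \alpha$. Then if $\alpha$
is not a limit, we must have $\alpha = \beta + 1$. Then we obtain $\beta \in \alpha$, hence by
assumption also $\beta + 1 \in \alpha$ and hence $\alpha \in \alpha$, which is impossible.

(b). Follows from (a), since $\omega$ is inductive.

(c). Assume $\mathrm{Lim}(\alpha)$. We show $n \in \alpha$ by induction on $n$. $0$. We have
$0 \in \alpha \lor 0 = \alpha \lor \alpha \in 0$, where the cases two and three clearly are impossible.
$n + 1$. We have $n \in \alpha$ by IH, hence $n + 1 \in \alpha$ by (a).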


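LEMMA 3.28. (a) $\alpha = \bigcup_{\beta \in \alpha} (\beta + 1)$.
(b) For limits $\alpha$ we have $\alpha = \bigcup_{\beta \in \alpha} \beta$.

PROOF. (a). $\subseteq$. Let $\beta \in \alpha$. The claim follows from $\beta \in \beta + 1$. $\supseteq$. Let
$\beta \in \alpha$. Then $\beta + 1 \subseteq \alpha$.

(b). $\subseteq$. Let $\gamma \in \alpha$. Then $\gamma \in \gamma + 1 \in \alpha$. $\supseteq$. Let $\gamma \in \beta \in \alpha$. We obtain
$\gamma \in \alpha$.

THEOREM 3.29 (Transfinite induction on On; class form).

    $(\forall \alpha.\ \alpha \subseteq B \to \alpha \in B) \to \mathrm{On} \subseteq B.$

PROOF. This is a special case of the Induction Theorem 3.2.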


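COROLLARY 3.30 (Different forms of transfinite induction on On). First
form:

    $A(0) \to (\forall \alpha.\ A(\alpha) \to A(\alpha + 1))$
    $\quad \to (\forall \alpha.\ \mathrm{Lim}(\alpha) \to (\forall \beta.\ \beta \in \alpha \to A(\beta)) \to A(\alpha))$
    $\quad \to \forall \alpha\, A(\alpha).$

Second form: (Transfinite induction on On, using all predecessors).

    $[\forall \alpha.\ (\forall \beta.\ \beta \in \alpha \to A(\beta)) \to A(\alpha)] \to \forall \alpha\, A(\alpha).$

Third form: (Principle of least element for On).

    $\exists \alpha\, A(\alpha) \to \exists \alpha.\ A(\alpha) \land \neg \exists \beta.\ \beta \in \alpha \land A(\beta).$

PROOF. The third form follows from the second by contraposition. Also,
the first form follows easily from the second. The second form follows from
Theorem 3.29 using $B := \{\, \alpha \mid A(\alpha) \,\}$.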


THEOREM 3.31 (Transfinite recursion on On). Let $G\colon V \to V$. Then
there is exactly one function $F\colon \mathrm{On} \to V$ such that for all $\alpha$

    $F(\alpha) = G(F \restriction \alpha).$

PROOF. This is a special case of the Recursion Theorem 3.3.


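COROLLARY 3.32. Assume $G\colon V \to V$, $H\colon V \to V$ and $a$ is a set. Then
there is a unique function $F\colon \mathrm{On} \to V$ such that

    $F(0) = a,$
    $F(\alpha + 1) = G(F(\alpha)),$
    $F(\alpha) = H(F \restriction \alpha)$   for $\alpha$ limit.

PROOF. First observe that $\bigcup (\alpha + 1) = \alpha$, because of

    $\gamma \in \bigcup (\alpha + 1) \leftrightarrow \exists \beta.\ \gamma \in \beta \in \alpha + 1$
    $\phantom{\gamma \in \bigcup (\alpha + 1)} \leftrightarrow \exists \beta.\ \gamma \in \beta \subseteq \alpha$
    $\phantom{\gamma \in \bigcup (\alpha + 1)} \leftrightarrow \gamma \in \alpha.$

For given $a$, $G$ and $H$ we shall find a $G'$ such that

    $G'(0) = a,$
    $G'(F \restriction (\alpha + 1)) = G(F(\alpha)),$
    $G'(F \restriction \alpha) = H(F \restriction \alpha)$   for $\alpha$ limit.

We define a function $G'\colon V \to V$ by

    $G'(x) = \begin{cases} G(x(\bigcup \mathrm{dom}(x))), & \text{if } \exists \beta\, (\mathrm{dom}(x) = \beta + 1); \\ H(x), & \text{if } \mathrm{Lim}(\mathrm{dom}(x)); \\ a, & \text{otherwise.} \end{cases}$

By the Recursion Theorem 3.3 there is a unique $F\colon \mathrm{On} \to V$ such that, for
all $\alpha$,

    $F(\alpha) = G'(F \restriction \alpha).$

Clearly this property of $F$ is equivalent to the equations above.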


3.6. Regularity Axiom, Von Neumann Levels, Rank. Recall the
cumulative type structure:

    Level 0: (none)
    Level 1: $\emptyset$
    Level 2: $\emptyset, \{\emptyset\}$
    Level 3: $\emptyset, \{\emptyset\}, \{\{\emptyset\}\}, \{\emptyset, \{\emptyset\}\}$

and so on.

Using ordinals we can now consider transfinite levels as well. The level $\omega$
consists of all sets whose elements are formed on finite levels, and the level
$\omega + 1$ consists of all sets whose elements are formed on finite levels or at level
$\omega$, and so on. Generally we define the Von Neumann levels $V_\alpha$ as follows,
by transfinite recursion on On.

    $V_0 = \emptyset,$
    $V_{\alpha + 1} = P(V_\alpha),$
    $V_\alpha = \bigcup_{\beta \in \alpha} V_\beta$   for $\alpha$ limit.

REMARK. More precisely, $V_\alpha := F(\alpha)$, where $F\colon \mathrm{On} \to V$ is defined as
follows, by transfinite recursion on On:

    $F(0) = \emptyset,$
    $F(\alpha + 1) = P(F(\alpha)),$
    $F(\alpha) = \bigcup \mathrm{rng}(F \restriction \alpha)$   for $\alpha$ limit.
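
The finite stages of this recursion can be computed explicitly. The following sketch assumes sets are modelled as nested `frozenset` values and covers only the zero and successor clauses; the limit clause needs infinite sets and is out of reach here. The names `powerset` and `V` are illustrative.

```python
from itertools import combinations

def powerset(s):
    # All subsets of s, each as a frozenset, collected into a frozenset.
    elems = list(s)
    return frozenset(
        frozenset(c) for r in range(len(elems) + 1) for c in combinations(elems, r)
    )

def V(n):
    # V_0 = empty set, V_{k+1} = P(V_k); only finite n are supported.
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level

sizes = [len(V(n)) for n in range(5)]
```

The list `sizes` shows how quickly the levels grow: each level has two to the power of the previous size many elements.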
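LEMMA 3.33. (a) $V_\alpha$ is transitive.
(b) $\alpha \in \beta \to V_\alpha \in V_\beta$.
(c) $\alpha \subseteq \beta \to V_\alpha \subseteq V_\beta$.
(d) $V_\alpha \cap \mathrm{On} = \alpha$.

PROOF. (a). (Transfinite) induction by $\alpha$. $0$. $\emptyset$ is transitive. $\alpha + 1$.

    $x \in y \in V_{\alpha + 1} = P(V_\alpha)$
    $x \in y \subseteq V_\alpha$
    $x \in V_\alpha$
    $x \subseteq V_\alpha$   by IH
    $x \in V_{\alpha + 1}.$

$\alpha$ limit.

    $x \in y \in V_\alpha = \bigcup_{\beta \in \alpha} V_\beta$
    $x \in y \in V_\beta$   for some $\beta \in \alpha$
    $x \in V_\beta$   by IH
    $x \in V_\alpha.$

(b). Induction by $\beta$. $0$. Clear. $\beta + 1$.

    $\alpha \in \beta + 1$
    $\alpha \in \beta$ or $\alpha = \beta$
    $V_\alpha \in V_\beta$ or $V_\alpha = V_\beta$   by IH
    $V_\alpha \subseteq V_\beta$   by (a)
    $V_\alpha \in V_{\beta + 1}.$

$\beta$ limit.

    $\alpha \in \beta$
    $\alpha + 1 \in \beta$
    $V_\alpha \in V_{\alpha + 1} \subseteq \bigcup_{\gamma \in \beta} V_\gamma = V_\beta.$

(c). Using $\alpha \subseteq \beta \leftrightarrow \alpha \in \beta \lor \alpha = \beta$ the claim follows from (a) and (b).

(d). Induction on $\alpha$. $0$. Clear. $\alpha + 1$.

    $\beta \in V_{\alpha + 1} \leftrightarrow \beta \subseteq V_\alpha$
    $\phantom{\beta \in V_{\alpha + 1}} \leftrightarrow \beta \subseteq V_\alpha \cap \mathrm{On} = \alpha$   by IH
    $\phantom{\beta \in V_{\alpha + 1}} \leftrightarrow \beta \in \alpha + 1.$

$\alpha$ limit.

    $V_\alpha \cap \mathrm{On} = (\bigcup_{\beta \in \alpha} V_\beta) \cap \mathrm{On}$
    $\phantom{V_\alpha \cap \mathrm{On}} = \bigcup_{\beta \in \alpha} (V_\beta \cap \mathrm{On})$
    $\phantom{V_\alpha \cap \mathrm{On}} = \bigcup_{\beta \in \alpha} \beta$   by IH
    $\phantom{V_\alpha \cap \mathrm{On}} = \alpha.$

This concludes the proof.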


We now show that the von Neumann levels exhaust the universe, which
means that $V = \bigcup_{\alpha \in \mathrm{On}} V_\alpha$. However, this requires another axiom, the
Regularity Axiom, which says that the relation $\in$ on $V$ is well-founded, i.e.,

AXIOM (Regularity Axiom).

    $\forall a.\ a \neq \emptyset \to \exists x \in a\, (x \cap a = \emptyset).$

We want to assign to every set $x$ an ordinal $\alpha$, namely the least $\alpha$ such
that $x \subseteq V_\alpha$. To this end we need the notion of the rank $\mathrm{rn}(x)$ of a set $x$,
which is defined recursively by

    $\mathrm{rn}(x) := \bigcup \{\, \mathrm{rn}(y) + 1 \mid y \in x \,\}.$

More precisely we define $\mathrm{rn}(x) := F(x)$, where $F\colon V \to V$ is defined as
follows (using the Recursion Theorem 3.3 for well-founded relations):

    $F(x) := \bigcup \mathrm{rng}(H(F \restriction x))$

with

    $H(z) := \{\, (u, v + 1) \mid (u, v) \in z \,\}.$

We first show that $\mathrm{rn}(x)$ has the property formulated above.
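LEMMA 3.34. (a) $\mathrm{rn}(x) \in \mathrm{On}$.
(b) $x \subseteq V_{\mathrm{rn}(x)}$.
(c) $x \subseteq V_\alpha \to \mathrm{rn}(x) \subseteq \alpha$.

PROOF. (a). $\in$-induction on $x$. We have $\mathrm{rn}(x) = \bigcup \{\, \mathrm{rn}(y) + 1 \mid y \in x \,\} \in \mathrm{On}$,
for by IH $\mathrm{rn}(y) \in \mathrm{On}$ for every $y \in x$.

(b). $\in$-induction on $x$. Let $y \in x$. Then $y \subseteq V_{\mathrm{rn}(y)}$ by IH, hence
$y \in P(V_{\mathrm{rn}(y)}) = V_{\mathrm{rn}(y) + 1} \subseteq V_{\mathrm{rn}(x)}$ because of $\mathrm{rn}(y) + 1 \subseteq \mathrm{rn}(x)$.

(c). Induction on $\alpha$. Let $x \subseteq V_\alpha$. We must show
$\mathrm{rn}(x) = \bigcup \{\, \mathrm{rn}(y) + 1 \mid y \in x \,\} \subseteq \alpha$. Let $y \in x$. We must show
$\mathrm{rn}(y) + 1 \subseteq \alpha$. Because of $x \subseteq V_\alpha$ we have $y \in V_\alpha$. This implies
$y \subseteq V_\beta$ for some $\beta \in \alpha$, for in case $\alpha = \alpha' + 1$
we have $y \in V_{\alpha' + 1} = P(V_{\alpha'})$ and hence $y \subseteq V_{\alpha'}$, and in case $\alpha$ limit we have
$y \in V_\alpha = \bigcup_{\beta \in \alpha} V_\beta$, hence $y \in V_\beta$ and therefore $y \subseteq V_\beta$ for some $\beta \in \alpha$. By
IH it follows that $\mathrm{rn}(y) \subseteq \beta$, whence $\mathrm{rn}(y) \in \alpha$.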


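Now we obtain easily the proposition formulated above as our goal.

COROLLARY 3.35. $V = \bigcup_{\alpha \in \mathrm{On}} V_\alpha$.

PROOF. $\supseteq$ is clear. $\subseteq$. For every $x$ we have $x \subseteq V_{\mathrm{rn}(x)}$ by Lemma 3.34(b),
hence $x \in V_{\mathrm{rn}(x) + 1}$.

Now $V_\alpha$ can be characterized as the set of all sets of rank less than $\alpha$.

LEMMA 3.36. $V_\alpha = \{\, x \mid \mathrm{rn}(x) \in \alpha \,\}$.

PROOF. $\supseteq$. Let $\mathrm{rn}(x) \in \alpha$. Then $x \subseteq V_{\mathrm{rn}(x)}$ implies $x \in V_{\mathrm{rn}(x) + 1} \subseteq V_\alpha$.
$\subseteq$. Induction on $\alpha$. Case $0$. Clear. Case $\alpha + 1$. Let $x \in V_{\alpha + 1}$. Then
$x \in P(V_\alpha)$, hence $x \subseteq V_\alpha$. For every $y \in x$ we have $y \in V_\alpha$ and hence
$\mathrm{rn}(y) \in \alpha$ by IH, so $\mathrm{rn}(y) + 1 \subseteq \alpha$. Therefore
$\mathrm{rn}(x) = \bigcup \{\, \mathrm{rn}(y) + 1 \mid y \in x \,\} \subseteq \alpha$.
Case $\alpha$ limit. Let $x \in V_\alpha$. Then $x \in V_\beta$ for some $\beta \in \alpha$, hence $\mathrm{rn}(x) \in \beta$ by
IH, hence $\mathrm{rn}(x) \in \alpha$.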


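From $x \in y$ and $x \subseteq y$, resp., we can infer the corresponding relations
between the ranks.

LEMMA 3.37. (a) $x \in y \to \mathrm{rn}(x) \in \mathrm{rn}(y)$.
(b) $x \subseteq y \to \mathrm{rn}(x) \subseteq \mathrm{rn}(y)$.

PROOF. (a). Because of $\mathrm{rn}(y) = \bigcup \{\, \mathrm{rn}(x) + 1 \mid x \in y \,\}$ this is clear. (b).
For every $z \in x$ we have $\mathrm{rn}(z) \in \mathrm{rn}(y)$ by (a), hence
$\mathrm{rn}(x) = \bigcup \{\, \mathrm{rn}(z) + 1 \mid z \in x \,\} \subseteq \mathrm{rn}(y)$.

Moreover we can show that the sets $\alpha$ and $V_\alpha$ both have rank $\alpha$.

LEMMA 3.38. (a) $\mathrm{rn}(\alpha) = \alpha$.
(b) $\mathrm{rn}(V_\alpha) = \alpha$.

PROOF. (a). Induction on $\alpha$. We have $\mathrm{rn}(\alpha) = \bigcup \{\, \mathrm{rn}(\beta) + 1 \mid \beta \in \alpha \,\}$,
hence by IH $\mathrm{rn}(\alpha) = \bigcup \{\, \beta + 1 \mid \beta \in \alpha \,\} = \alpha$ by Lemma 3.28(a).

(b). We have

    $\mathrm{rn}(V_\alpha) = \bigcup \{\, \mathrm{rn}(x) + 1 \mid x \in V_\alpha \,\}$
    $\phantom{\mathrm{rn}(V_\alpha)} = \bigcup \{\, \mathrm{rn}(x) + 1 \mid \mathrm{rn}(x) \in \alpha \,\}$   by Lemma 3.36
    $\phantom{\mathrm{rn}(V_\alpha)} \subseteq \alpha.$

Conversely, let $\beta \in \alpha$. By (a), $\mathrm{rn}(\beta) = \beta \in \alpha$, hence

    $\beta = \mathrm{rn}(\beta) \in \bigcup \{\, \mathrm{rn}(x) + 1 \mid \mathrm{rn}(x) \in \alpha \,\} = \mathrm{rn}(V_\alpha).$

This completes the proof.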
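
On hereditarily finite sets the rank recursion and Lemmas 3.36 and 3.38 can be tested directly. This is a sketch under assumptions: sets are nested `frozenset` values, finite ordinals are represented by Python integers (so the supremum in the definition of the rank becomes a maximum), and all helper names are illustrative.

```python
from itertools import combinations

def powerset(s):
    elems = list(s)
    return frozenset(
        frozenset(c) for r in range(len(elems) + 1) for c in combinations(elems, r)
    )

def V(n):
    # Finite von Neumann levels: V_0 = empty set, V_{k+1} = P(V_k).
    level = frozenset()
    for _ in range(n):
        level = powerset(level)
    return level

def rank(x):
    # rn(x) = sup { rn(y) + 1 | y in x }; for finite ranks the sup is a max.
    return max((rank(y) + 1 for y in x), default=0)

def von_neumann(n):
    # The finite ordinal n as a set: 0 = empty set, k + 1 = k ∪ {k}.
    x = frozenset()
    for _ in range(n):
        x = x | frozenset({x})
    return x

V3, V4 = V(3), V(4)
low_rank_part = {x for x in V4 if rank(x) < 3}
```

Within `V4`, the elements of rank below 3 are exactly the elements of `V3`, which is the finite instance of Lemma 3.36.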


We finally show that a class $A$ is a set if and only if the ranks of its
elements can be bounded by an ordinal.

LEMMA 3.39. $A$ is a set iff there is an $\alpha$ such that $\forall y \in A\ (\mathrm{rn}(y) \in \alpha)$.

PROOF. $\to$. Let $A = x$. From Lemma 3.37(a) we obtain that $\mathrm{rn}(x)$ is
the $\alpha$ we need.
$\leftarrow$. Assume $\mathrm{rn}(y) \in \alpha$ for all $y \in A$. Then
$A \subseteq \{\, y \mid \mathrm{rn}(y) \in \alpha \,\} = V_\alpha$.


4. Cardinals 
We now introduce cardinals and develop their basic properties. 


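4.1. Size Comparison Between Sets. Define

    $|a| \le |b| :\leftrightarrow \exists f.\ f\colon a \to b$ and $f$ injective,
    $|a| = |b| :\leftrightarrow \exists f.\ f\colon a \leftrightarrow b,$
    $|a| < |b| :\leftrightarrow |a| \le |b| \land |a| \neq |b|,$
    ${}^{b}a := \{\, f \mid f\colon b \to a \,\}.$

Two sets $a$ and $b$ are equinumerous if $|a| = |b|$. Notice that we did not define
$|a|$, but only the relations $|a| \le |b|$, $|a| = |b|$ and $|a| < |b|$.

The following properties are clear:

    $|a \times b| = |b \times a|;$
    $|{}^{a}({}^{b}c)| = |{}^{a \times b}c|;$
    $|P(a)| = |{}^{a}\{0, 1\}|.$

THEOREM 4.1 (Cantor). $|a| < |P(a)|$.

PROOF. Clearly $f\colon a \to P(a)$, $x \mapsto \{x\}$ is injective. Assume that we
have $g\colon a \leftrightarrow P(a)$. Consider

    $b := \{\, x \mid x \in a \land x \notin g(x) \,\}.$

Then $b \subseteq a$, hence $b = g(x_0)$ for some $x_0 \in a$. It follows that
$x_0 \in g(x_0) \leftrightarrow x_0 \notin g(x_0)$ and hence a contradiction.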
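
For a small finite set the diagonal argument can be verified exhaustively. The sketch below assumes a three-element set and enumerates every function $g\colon a \to P(a)$, checking that the diagonal set $b$ is never in the range of $g$; the names `powerset` and `diagonal` are illustrative.

```python
from itertools import combinations, product

def powerset(a):
    # All subsets of a, as a list of frozensets.
    return [frozenset(c) for r in range(len(a) + 1) for c in combinations(a, r)]

def diagonal(a, g):
    # b = {x in a | x not in g(x)}, where g is given as a dict.
    return frozenset(x for x in a if x not in g[x])

a = [0, 1, 2]
subsets = powerset(a)

# Each tuple `images` lists g(0), g(1), g(2) for one function g: a -> P(a).
all_missed = all(
    diagonal(a, dict(zip(a, images))) not in images
    for images in product(subsets, repeat=len(a))
)
```

There are 512 such functions for a three-element set, and `all_missed` records that none of them hits its own diagonal set.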


THEOREM 4.2 (Cantor, Bernstein). If $a \subseteq b \subseteq c$ and $|a| = |c|$, then
$|b| = |c|$.

PROOF. Let $f\colon c \to a$ be bijective and $r := c \setminus b$. We recursively define
$g\colon \omega \to V$ by

    $g(0) = r,$
    $g(n + 1) = f[g(n)].$

Let

    $\bar{r} := \bigcup_n g(n)$

and define $i\colon c \to b$ by

    $i(x) := \begin{cases} f(x), & \text{if } x \in \bar{r}, \\ x, & \text{if } x \notin \bar{r}. \end{cases}$

It suffices to show that (a) $\mathrm{rng}(i) = b$ and (b) $i$ is injective. Ad (a). Let
$x \in b$. We must show $x \in \mathrm{rng}(i)$. Wlog let $x \in \bar{r}$. Because of $x \in b$ we then
have $x \notin g(0)$. Hence there is an $n$ such that $x \in g(n + 1) = f[g(n)]$, so
$x = f(y) = i(y)$ for some $y \in \bar{r}$. Ad (b). Let $x \neq y$. Wlog $x \in \bar{r}$, $y \notin \bar{r}$. But
then $i(x) \in \bar{r}$, $i(y) \notin \bar{r}$, hence $i(x) \neq i(y)$.
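
The construction of $i$ can be followed on a concrete instance. The sketch below uses a hypothetical choice that is not from the text: $c$ is the set of natural numbers, $b = c \setminus \{0\}$, $a$ is the set of positive even numbers, and $f(n) = 2n + 2$. Then $r = \{0\}$ and $\bar{r}$ is the orbit of $0$ under $f$.

```python
def f(n):
    # Assumed bijection from c = {0, 1, 2, ...} onto a = {2, 4, 6, ...}.
    return 2 * n + 2

def in_r_bar(x):
    # r = c \ b = {0}; g(0) = {0}, g(n+1) = f[g(n)], so r_bar = {0, 2, 6, 14, ...}.
    y = 0
    while y < x:
        y = f(y)
    return y == x

def i(x):
    # Move points of r_bar along f, leave all other points fixed.
    return f(x) if in_r_bar(x) else x

N = 200
images = [i(x) for x in range(N)]
```

On the window of the first `N` naturals, `i` is injective, never takes the value 0, and reaches every element of $b$ below `N`.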


REMARK. The Theorem of Cantor and Bernstein can be seen as an 
application of the Fixed Point Theorem of Knaster-Tarski. 


COROLLARY 4.3. $|a| \le |b| \to |b| \le |a| \to |a| = |b|$.

PROOF. Let $f\colon a \to b$ and $g\colon b \to a$ be injective. Then $(g \circ f)[a] \subseteq g[b] \subseteq a$
and $|(g \circ f)[a]| = |a|$. By the Theorem of Cantor and Bernstein $|b| = |g[b]| = |a|$.

4.2. Cardinals, Aleph Function. A cardinal is defined to be an ordinal
that is not equinumerous to a smaller ordinal:

    $\alpha$ is a cardinal if $\forall \beta < \alpha\ (|\beta| \neq |\alpha|)$.

Here and later we write - because of Lemma 3.22 - $\alpha < \beta$ for $\alpha \in \beta$ and
$\alpha \le \beta$ for $\alpha \subseteq \beta$.
LEMMA 4.4. $|n| = |m| \to n = m$.

PROOF. Induction on $n$. $0$. Clear. $n + 1$. Let $f\colon n + 1 \leftrightarrow m + 1$. We
may assume $f(n) = m$. Hence $f \restriction n\colon n \leftrightarrow m$ and therefore $n = m$ by IH,
hence also $n + 1 = m + 1$.

COROLLARY 4.5. $n$ is a cardinal.

LEMMA 4.6. $|n| \neq |\omega|$.

PROOF. Assume $|n| = |\omega|$. Because of $n \subseteq n + 1 \subseteq \omega$ the Theorem of
Cantor and Bernstein implies $|n| = |n + 1|$, a contradiction.

COROLLARY 4.7. $\omega$ is a cardinal.

LEMMA 4.8. $\omega \le \alpha \to |\alpha + 1| = |\alpha|$.

PROOF. Define $f\colon \alpha \to \alpha + 1$ by

    $f(x) := \begin{cases} \alpha, & \text{if } x = 0; \\ n, & \text{if } x = n + 1; \\ x, & \text{otherwise.} \end{cases}$

Then $f\colon \alpha \leftrightarrow \alpha + 1$.

COROLLARY 4.9. If $\omega \le \alpha$ and $\alpha$ is a cardinal, then $\alpha$ is a limit.

PROOF. Assume $\alpha = \beta + 1$. Then $\omega \le \beta < \alpha$, hence $|\beta| = |\beta + 1|$,
contradicting the assumption that $\alpha$ is a cardinal.

LEMMA 4.10. If $a$ is a set of cardinals, then $\sup(a)$ $(:= \bigcup a)$ is a cardinal.

PROOF. Otherwise there would be an $\alpha < \sup(a)$ such that
$|\alpha| = |\sup(a)|$. Hence $\alpha \in \bigcup a$ and therefore $\alpha \in \beta \in a$ for some cardinal $\beta$.
By the Theorem of Cantor and Bernstein from $\alpha \subseteq \beta \subseteq \bigcup a$ and $|\alpha| = |\bigcup a|$
it follows that $|\alpha| = |\beta|$. Because of $\alpha \in \beta$ and $\beta$ a cardinal this is impossible.


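We now show that for every ordinal there is a strictly bigger cardinal.
More generally, even the following holds:

THEOREM 4.11 (Hartogs).

    $\forall a\, \exists! \alpha.\ \forall \beta < \alpha\ (|\beta| \le |a|) \land |\alpha| \not\le |a|.$

$\alpha$ is the Hartogs number of $a$; it is denoted by $H(a)$.

PROOF. Uniqueness. Clear. Existence. Let
$w := \{\, (b, r) \mid b \subseteq a \land r \text{ well-ordering on } b \,\}$ and $\gamma_{(b,r)}$ the uniquely
determined ordinal isomorphic to $(b, r)$. Then $\{\, \gamma_{(b,r)} \mid (b, r) \in w \,\}$ is a
transitive subset of On, hence an ordinal $\alpha$. We must show

(a) $\beta < \alpha \to |\beta| \le |a|$,
(b) $|\alpha| \not\le |a|$.

(a). Let $\beta < \alpha$. Then $\beta$ is isomorphic to a $\gamma_{(b,r)}$ with $(b, r) \in w$, hence there
exists an $f\colon \beta \leftrightarrow b$.

(b). Assume $f\colon \alpha \to a$ is injective. Then $\alpha = \gamma_{(b,r)}$ for some $b \subseteq a$
($b := \mathrm{rng}(f)$), hence $\alpha \in \alpha$, a contradiction.

REMARK. (a). The Hartogs number of $a$ is a cardinal. For let $\alpha$ be the
Hartogs number of $a$, $\beta < \alpha$. If $|\beta| = |\alpha|$, we would have $|\alpha| = |\beta| \le |a|$, a
contradiction.

(b). The Hartogs number of $\beta$ is the least cardinal $\alpha$ such that $\alpha > \beta$.

The aleph function $\aleph\colon \mathrm{On} \to V$ is defined recursively by

    $\aleph_0 := \omega,$
    $\aleph_{\alpha + 1} := H(\aleph_\alpha),$
    $\aleph_\alpha := \sup \{\, \aleph_\beta \mid \beta < \alpha \,\}$   for $\alpha$ limit.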


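LEMMA 4.12 (Properties of $\aleph$). (a) $\aleph_\alpha$ is a cardinal.
(b) $\alpha < \beta \to \aleph_\alpha < \aleph_\beta$.
(c) $\forall \beta.\ \beta \text{ cardinal} \to \omega \le \beta \to \exists \alpha\, (\beta = \aleph_\alpha)$.

PROOF. (a). Induction on $\alpha$; clear. (b). Induction on $\beta$. $0$. Clear.
$\beta + 1$.

    $\alpha < \beta + 1$
    $\alpha < \beta \lor \alpha = \beta$
    $\aleph_\alpha < \aleph_\beta \lor \aleph_\alpha = \aleph_\beta$
    $\aleph_\alpha < \aleph_{\beta + 1}.$

$\beta$ limit.

    $\alpha < \beta$
    $\alpha < \gamma$   for some $\gamma < \beta$
    $\aleph_\alpha < \aleph_\gamma \le \aleph_\beta.$

(c). Let $\alpha$ be minimal such that $\beta \le \aleph_\alpha$. Such an $\alpha$ exists, for otherwise
$\aleph\colon \mathrm{On} \to \beta$ would be injective. We show $\aleph_\alpha \le \beta$ by cases on $\alpha$. $0$. Clear.
$\alpha = \alpha' + 1$. By the choice of $\alpha$ we have $\aleph_{\alpha'} < \beta$, hence $\aleph_\alpha \le \beta$. $\alpha$ limit.
By the choice of $\alpha$ we have $\aleph_\gamma < \beta$ for all $\gamma < \alpha$, hence
$\aleph_\alpha = \sup \{\, \aleph_\gamma \mid \gamma < \alpha \,\} \le \beta$.

We show that every infinite ordinal is equinumerous to a cardinal.

LEMMA 4.13. $(\forall \beta \ge \omega)\, \exists \alpha\, (|\beta| = |\aleph_\alpha|)$.

PROOF. Consider $\delta := \min \{\, \gamma \mid \gamma \le \beta \land |\gamma| = |\beta| \,\}$. Clearly $\delta$ is a
cardinal. Moreover $\delta \ge \omega$, for otherwise

    $\delta = n$
    $|n| = |\beta|$
    $n \subseteq n + 1 \subseteq \beta$
    $|n| = |n + 1|,$

a contradiction. Hence $\delta = \aleph_\alpha$ for some $\alpha$, and therefore $|\delta| = |\beta| = |\aleph_\alpha|$.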


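4.3. Products of Cardinals. We now show that $|\aleph_\alpha \times \aleph_\alpha| = |\aleph_\alpha|$.
On the set $\mathrm{On} \times \mathrm{On}$ we define a relation $\prec$ by

    $(\alpha, \beta) \prec (\gamma, \delta) :\leftrightarrow \max\{\alpha, \beta\} < \max\{\gamma, \delta\} \ \lor$
    $\qquad (\max\{\alpha, \beta\} = \max\{\gamma, \delta\} \land \alpha < \gamma) \ \lor$
    $\qquad (\max\{\alpha, \beta\} = \max\{\gamma, \delta\} \land \alpha = \gamma \land \beta < \delta).$

LEMMA 4.14. $\prec$ is a well-ordering on $\mathrm{On} \times \mathrm{On}$.

PROOF. Clearly $\prec$ is a linear ordering. To see the well-foundedness of
$\prec$ consider an $a \subseteq \mathrm{On} \times \mathrm{On}$ such that $a \neq \emptyset$. Then

    $\emptyset \neq A := \{\, \alpha \mid \exists \rho, \mu\, ((\rho, \mu) \in a \land \max\{\rho, \mu\} = \alpha) \,\} \subseteq \mathrm{On}.$

Let $\alpha_0 := \min(A)$. Then

    $\emptyset \neq A_1 := \{\, \rho \mid \exists \mu\, ((\rho, \mu) \in a \land \max\{\rho, \mu\} = \alpha_0) \,\} \subseteq \mathrm{On}.$

Let $\rho_0 := \min(A_1)$. Then

    $\emptyset \neq A_2 := \{\, \mu \mid (\rho_0, \mu) \in a \land \max\{\rho_0, \mu\} = \alpha_0 \,\} \subseteq \mathrm{On}.$

Let $\mu_0 := \min(A_2)$. Then clearly $(\rho_0, \mu_0) = \min_\prec(a)$. Finally notice that
$\widehat{(\alpha, \beta)}$ must be a set, for $\widehat{(\alpha, \beta)} \subseteq \gamma \times \gamma$ with $\gamma := \max\{\alpha, \beta\} + 1$.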
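
Restricted to pairs of natural numbers, the ordering $\prec$ is just lexicographic comparison of the triples (maximum, first component, second component). The sketch below makes that concrete; the names `key` and `initial_segment` are illustrative, and only finite ordinals are covered.

```python
def key(pair):
    # Compare first by the maximum, then by the first and then the second component.
    a, b = pair
    return (max(a, b), a, b)

def initial_segment(n):
    # The pairs below (0, n) in the ordering are exactly the pairs with maximum < n.
    pairs = [(a, b) for a in range(n) for b in range(n)]
    return sorted(pairs, key=key)

segment = initial_segment(3)
position_of_0_2 = segment.index((0, 2))
```

The initial segment below $(0, n)$ is the square $n \times n$ with $n^2$ elements, which is the finite shadow of the fact that every proper initial segment is a set.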


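COROLLARY 4.15. $\mathrm{On} \times \mathrm{On}$ is isomorphic to $\mathrm{On}$ (w.r.t. $\prec$ and ${\in} \restriction \mathrm{On}$).

PROOF. By Lemma 4.14 $\prec$ is a well-ordering on $\mathrm{On} \times \mathrm{On}$. Hence by
Corollary 3.19 there is an isomorphism onto a transitive and hence also
ordinal class. This class cannot possibly be a set, for then $\mathrm{On} \times \mathrm{On}$ would be
a set as well. But by Lemma 3.23(c) On is the only proper ordinal class.

THEOREM 4.16. $\aleph_\alpha \times \aleph_\alpha$ is isomorphic to $\aleph_\alpha$ (w.r.t. the relations $\prec$ on
$\aleph_\alpha \times \aleph_\alpha$ and $\in$ on $\aleph_\alpha$).

PROOF. Assume: $\exists \alpha\, (\aleph_\alpha \times \aleph_\alpha$ not isomorphic to $\aleph_\alpha)$. Let

    $\alpha_0 := \min \{\, \alpha \mid \aleph_\alpha \times \aleph_\alpha \text{ not isomorphic to } \aleph_\alpha \,\}.$

Clearly $\alpha_0 \neq 0$. Since $\aleph_{\alpha_0} \times \aleph_{\alpha_0}$ and $\aleph_{\alpha_0}$ are well-ordered sets, one of them
must be isomorphic to a proper initial segment of the other. Therefore we
distinguish two cases.

Case (a). $\aleph_{\alpha_0}$ is isomorphic to $\widehat{(\beta, \gamma)}$ with $\beta, \gamma < \aleph_{\alpha_0}$. Choose $\delta < \aleph_{\alpha_0}$
with $\beta, \gamma < \delta$. Then $\widehat{(\beta, \gamma)} \subseteq \delta \times \delta$, and

    $|\aleph_{\alpha_0}| = |\widehat{(\beta, \gamma)}| \le |\delta \times \delta| = |\aleph_\tau \times \aleph_\tau|$   for some $\tau < \alpha_0$
    $= |\aleph_\tau|$   by choice of $\alpha_0$,

hence a contradiction to Lemma 4.12(b).

Case (b). $\aleph_{\alpha_0} \times \aleph_{\alpha_0}$ is isomorphic to $\beta < \aleph_{\alpha_0}$. Then

    $|\aleph_{\alpha_0}| \le |\aleph_{\alpha_0} \times \aleph_{\alpha_0}| = |\beta| \le |\aleph_{\alpha_0}|$
    $|\aleph_{\alpha_0}| = |\beta|,$

hence a contradiction to the fact that $\aleph_{\alpha_0}$ is a cardinal, $\beta < \aleph_{\alpha_0}$.

COROLLARY 4.17. (a) $|\aleph_\alpha \times \aleph_\beta| = |\max\{\aleph_\alpha, \aleph_\beta\}|$.
(b) $n \neq 0 \to |{}^{n}\aleph_\alpha| = |\aleph_\alpha|$.

PROOF. (a). We may assume $\alpha \le \beta$. Then

    $|\aleph_\beta| \le |\aleph_\alpha \times \aleph_\beta| \le |\aleph_\beta \times \aleph_\beta| = |\aleph_\beta|.$

(b). This follows easily from Theorem 4.16, by induction on $n$.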


5. The Axiom of Choice 


5.1. Axiom of Choice, Well Ordering Theorem, Zorn’s Lemma. 
A relation $R$ on $A$ is a partial ordering if for all $x, y, z \in A$

    $\neg xRx$,   irreflexivity
    $xRy \to yRz \to xRz$,   transitivity.

An element $x \in A$ is maximal if there is no $y \in A$ such that $xRy$. Let
$B \subseteq A$. An element $x \in A$ is an upper bound of $B$ if

    $\forall y \in B.\ yRx \lor y = x.$


THEOREM 5.1. The following are equivalent.
(a) The axiom of choice (AC)

    $\forall x.\ \emptyset \notin x \to \exists f.\ f\colon x \to \bigcup x \land (\forall y \in x)(f(y) \in y).$

(b) The well ordering theorem (WO)

    $\forall a\, \exists r\, (r \text{ is a well ordering on } a).$

(c) Zorn's Lemma (ZL): Let $(P, <)$ be a non empty partial ordering, with
the property that every (by $<$) linearly ordered subset $L \subseteq P$ has an
upper bound in $P$. Then $P$ has a maximal element.

PROOF. (ZL) $\to$ (WO). Let $a$ be given, and define

    $P := \{\, f \mid \exists \alpha\, (f\colon \alpha \to a \text{ injective}) \,\} \subseteq P(H(a) \times a).$

$P$ is partially ordered by proper inclusion $\subsetneq$. Let $L \subseteq P$ be linearly ordered.
Then $\bigcup L \in P$, hence $\bigcup L$ is an upper bound of $L$. Zorn's Lemma then gives
a maximal element $f_0 \in P$. Clearly $f_0$ is a bijection of an ordinal $\alpha_0$ onto
$a$, hence $f_0$ induces a well ordering on $a$.

(WO) $\to$ (AC). Let $\emptyset \notin x$. By (WO) there is a well ordering $<$ on $\bigcup x$.
Clearly $<$ induces a well ordering on every $y \in x$. Define

    $f\colon x \to \bigcup x,$
    $y \mapsto \min_<(y) \in y.$

(AC) $\to$ (ZL). Let $<$ be a partial ordering on $P \neq \emptyset$. Assume that every
subset $L \subseteq P$ linearly ordered by $<$ has an upper bound in $P$. By (AC)
there is a choice function $f$ on $P(P) \setminus \{\emptyset\}$. Let $z \notin P$ be arbitrary, and
define

    $F\colon \mathrm{On} \to V,$
    $F(\alpha) = \begin{cases} f(\{\, y \mid y \in P \setminus F[\alpha] \land y \text{ upper bound of } F[\alpha] \,\}), & \text{if } \{\dots\} \neq \emptyset; \\ z, & \text{otherwise.} \end{cases}$

Then there is a $\rho$ such that $F(\rho) = z$, for otherwise $F\colon \mathrm{On} \to P$ would be
injective, contradicting our assumption that $P$ is a set. Let
$\rho_0 := \min \{\, \rho \mid F(\rho) = z \,\}$. $F[\rho_0]$ is linearly ordered, and we have $F[\rho_0] \subseteq P$. By
assumption there is an upper bound $y_0 \in P$ of $F[\rho_0]$. We show that $y_0$ is a maximal
element in $P$. So assume $y_0 < y$ for some $y \in P$. Then $y$ is an upper bound
of $F[\rho_0]$ and $y \notin F[\rho_0]$. But this contradicts the definition of $\rho_0$.
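
The step (WO) $\to$ (AC) is constructive once a well-ordering is given, and for finite families it can be written out. In the sketch below the well-ordering is supplied as a list enumerating the union of the family; the function name, the parameter names and the example family are hypothetical.

```python
def choice_function(family, well_order):
    """Pick from each nonempty set its least element w.r.t. the given well-ordering.

    well_order lists the elements of the union without repetition;
    earlier entries count as smaller.
    """
    position = {element: k for k, element in enumerate(well_order)}
    return {y: min(y, key=position.__getitem__) for y in family}

family = [frozenset({"c", "a"}), frozenset({"b", "c"}), frozenset({"c"})]
order = ["b", "c", "a"]  # an arbitrary well-ordering of the union
f = choice_function(family, order)
```

Every value `f[y]` lies in `y`, which is exactly the defining property of a choice function.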


From now on we will assume the axiom of choice; however, we will mark 
every theorem and every definition depending on it by (AC). 

(AC) clearly is equivalent to its special case where every two elements
$y_1, y_2 \in x$ are disjoint. We hence note the following equivalent to the axiom
of choice:

LEMMA 5.2. The following are equivalent
(a) The axiom of choice (AC).
(b) For every surjective $g\colon a \to b$ there is an injective $f\colon b \to a$ such that

    $(\forall x \in b)(g(f x) = x).$

PROOF. (a) $\Rightarrow$ (b). Let $g\colon b \to a$ be surjective. By (AC) there is a
well-ordering $<$ of $b$. Define $f\colon a \to b$ by $f(x) := \min_<\{\, y \mid y \in b \land g(y) = x \,\}$.

(b) $\Rightarrow$ (a). We may assume $x \neq \emptyset$ and
$(\forall y_1, y_2 \in x)(y_1 \neq y_2 \to y_1 \cap y_2 = \emptyset)$. Define
$g\colon \bigcup x \to x$ by $g(z) :=$ the unique $y \in x$ such that $z \in y$. Then $g$ is
surjective. By (b) there is an injective $f\colon x \to \bigcup x$ such that $g(f y) = y$ for
all $y \in x$, hence $f(y) \in y$.


5.2. Cardinality. $\alpha$ is the cardinality of $a$ if $\alpha$ is a cardinal and there
is a bijection $f\colon a \to \alpha$.

THEOREM 5.3 (AC). Every set has a unique cardinality.

PROOF. Uniqueness. Clear. Existence. Let $<$ be a well-ordering on $a$.
Then there is a $\gamma$ such that $a$ is isomorphic to $\gamma$. Hence $\{\, \tau \mid |\tau| = |a| \,\} \neq \emptyset$
and therefore $\min \{\, \tau \mid |\tau| = |a| \,\}$ is a cardinal.

Clearly $|a| = |b|$ iff the cardinality of $a$ equals the cardinality of $b$, and
$|a| \le |b|$ iff the cardinality of $a$ is less than or equal to the cardinality of $b$.
Therefore we can use $|a|$ as a notation for the cardinality of $a$.

A set $a$ is defined to be finite if $a$ can be mapped bijectively onto a
natural number, and infinite otherwise. Using (AC) it follows that $a$ is
finite iff $|a| < \omega$.

LEMMA 5.4 (AC). If $a, b \neq \emptyset$ and $a$ or $b$ is infinite, then

    $|a \times b| = \max\{|a|, |b|\}.$

PROOF. Let $|a| = \max\{|a|, |b|\}$. Then

    $|a| \le |a \times b| = ||a| \times |b|| \le ||a| \times |a|| = |a|.$


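THEOREM 5.5 (AC). Let $I$ be infinite or $\sup_{i \in I} |A_i|$ be infinite. Then
(a) $|\bigcup_{i \in I} A_i| \le \max\{|I|, \sup_{i \in I} |A_i|\}$.
(b) If in addition $(\forall i \in I)(A_i \neq \emptyset)$ and $(\forall i, j \in I)(i \neq j \to A_i \cap A_j = \emptyset)$,
then equality holds.

PROOF. (a). We may assume $\kappa := \sup_{i \in I} |A_i| \neq 0$. Choose a
well-ordering $<$ of $I$ and define w.r.t. this well-ordering

    $f\colon \bigcup_{i \in I} A_i \to \bigcup_{i \in I} (\{i\} \times A_i),$
    $f(x) = (\min\{\, i \in I \mid x \in A_i \,\}, x).$

Clearly $f$ is injective. Hence

    $|\bigcup_{i \in I} A_i| \le |\bigcup_{i \in I} (\{i\} \times A_i)|$
    $\qquad \le |\bigcup_{i \in I} (\{i\} \times \kappa)|$
    $\qquad = |I \times \kappa|$
    $\qquad = \max\{|I|, \kappa\}.$

(b). Because of (a) it suffices to show that $|I|, |A_i| \le |\bigcup_{i \in I} A_i|$. The
second estimate is clear. For the first one choose a well-ordering $<$ of $\bigcup_{i \in I} A_i$
and define $f\colon I \to \bigcup_{i \in I} A_i$ by $f(i) := \min_<\{\, x \mid x \in A_i \,\}$. By our assumption
$f$ is injective.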
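
The injection used in part (a) tags each element of the union with the least index of a set containing it. A finite sketch follows, assuming the index set is ordered by Python's built-in comparison; the function name and the example family are hypothetical.

```python
def tag_with_first_index(family):
    # family maps each index i to the set A_i.
    # Each x in the union is sent to (least i with x in A_i, x).
    union = set().union(*family.values())
    return {x: (min(i for i, A in family.items() if x in A), x) for x in union}

family = {0: {"p", "q"}, 1: {"q", "r"}, 2: {"r", "s"}}
f = tag_with_first_index(family)
largest = max(len(A) for A in family.values())
```

Since the second component of `f[x]` is `x` itself, the map is injective, and its image lies in the disjoint union of the tagged copies of the sets.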


A set a is Dedekind-finite if a cannot be mapped bijectively onto a proper 
subset b of a, otherwise Dedekind-infinite. 


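THEOREM 5.6 (AC). A set $a$ is Dedekind-infinite iff $a$ is infinite.

PROOF. $\to$. Let $b \subsetneq a$ and $f\colon a \leftrightarrow b$. Assume $|a| < \omega$, say $|a| = n$.
Then there is a $c \subsetneq n$ and some $g\colon n \leftrightarrow c$. We show by induction on $n$ that
this is impossible:

    $\forall n\, \neg (\exists c \subsetneq n)\, \exists g\, (g\colon n \leftrightarrow c).$

$0$. Clear. $n + 1$. Let $g\colon n + 1 \leftrightarrow c$ and $c \subsetneq n + 1$. We may assume $n \notin \mathrm{rng}(g \restriction n)$.
It follows that $g \restriction n\colon n \leftrightarrow c \setminus \{n\} \subsetneq n$ and hence a contradiction to the IH.

$\leftarrow$. Let $g\colon \omega \to a$ be injective and $h\colon g[\omega] \leftrightarrow g[\omega \setminus 1]$ defined by

    $h = \{\, (g(n), g(n + 1)) \mid n \in \omega \,\}.$

Define $f\colon a \leftrightarrow (a \setminus \{g(0)\})$ by

    $f(x) = \begin{cases} x, & \text{if } x \in a \setminus g[\omega]; \\ h(x), & \text{otherwise.} \end{cases}$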
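
The bijection from the direction $\leftarrow$ can be written out for a concrete choice that is only assumed here for illustration: $a$ is the set of natural numbers and $g(n) = 2n$, so $g[\omega]$ is the set of even numbers and $g(0) = 0$.

```python
def g(n):
    # Assumed injective map from omega into a = {0, 1, 2, ...}.
    return 2 * n

def f(x):
    # Points outside g[omega] (the odd numbers) stay fixed;
    # a point g(n) is shifted to g(n + 1), following h.
    if x % 2 == 1:
        return x
    return g(x // 2 + 1)

N = 100
images = [f(x) for x in range(N)]
```

On this window `f` is injective, omits `g(0) = 0`, and reaches every other number below `N`, so it maps the set onto a proper subset of itself.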


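5.3. Regular and Singular Cardinals. Let $\kappa, \lambda$ denote cardinals $\ge \omega$.
In this section we shall always assume the Axiom of Choice (AC).

DEFINITION 5.7 (AC). (a) $x \subseteq \kappa$ is confinal in $\kappa$ if $\sup(x) = \kappa$.
(b) $\mathrm{cf}(\kappa) := \min \{\, |x| \mid x \subseteq \kappa \text{ and } x \text{ confinal with } \kappa \,\}$ is the confinality of $\kappa$.
(c) $\kappa$ is regular if $\mathrm{cf}(\kappa) = \kappa$.
(d) $\kappa$ is singular if $\mathrm{cf}(\kappa) < \kappa$.

THEOREM 5.8 (AC). (a) $\omega = \aleph_0$ is regular.
(b) $\aleph_{\alpha + 1}$ is regular.
(c) If $\beta$ is a limit and $\beta < \aleph_\beta$, then $\aleph_\beta$ is singular.

PROOF. (a). Assume $\omega$ is singular, that is $\mathrm{cf}(\omega) < \omega$. Then there is an
$x \subseteq \omega$ such that $|x| = n$ and $\sup(x) = \omega$. But this is impossible (proof by
induction on $n$).

(b). Assume $\aleph_{\alpha + 1}$ is singular. Then $\mathrm{cf}(\aleph_{\alpha + 1}) \le \aleph_\alpha$. Hence there is an
$x \subseteq \aleph_{\alpha + 1}$ such that $|x| \le \aleph_\alpha$ and $\sup(x) = \aleph_{\alpha + 1}$. But then

    $\aleph_{\alpha + 1} = |\bigcup x|$
    $\qquad \le \max\{|x|, \sup\{\, |y| \mid y \in x \,\}\}$   by Theorem 5.5(a)
    $\qquad \le \aleph_\alpha,$

a contradiction.

(c). Let $\beta$ be a limit such that $\beta < \aleph_\beta$. Then we have
$\aleph_\beta = \sup \{\, \aleph_\gamma \mid \gamma < \beta \,\}$ and moreover
$|\{\, \aleph_\gamma \mid \gamma < \beta \,\}| = |\beta| < \aleph_\beta$. Hence $\aleph_\beta$ is singular.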


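By definition for every infinite cardinal $\kappa$ there is a subset $x \subseteq \kappa$ whose
cardinality equals $\mathrm{cf}(\kappa)$, hence which can be mapped bijectively onto $\mathrm{cf}(\kappa)$.
We now show that one can even assume that this bijection is an isomorphism.

LEMMA 5.9 (AC). Let $\kappa$ be an infinite cardinal. Then there exists a
subset $x \subseteq \kappa$ confinal in $\kappa$ that is isomorphic to $\mathrm{cf}(\kappa)$.

PROOF. Let $y \subseteq \kappa$, $\sup(y) = \kappa$, $|y| = \mathrm{cf}(\kappa)$ and $g\colon \mathrm{cf}(\kappa) \leftrightarrow y$. By
transfinite recursion we define

    $F\colon \mathrm{On} \to V,$
    $F(\alpha) := \sup(F[\alpha] \cup g[\alpha]) + 1.$

Let $f := F \restriction \mathrm{cf}(\kappa)$. One can see easily

(a) $\alpha < \beta < \mathrm{cf}(\kappa) \to f(\alpha) < f(\beta) \land g(\alpha) < f(\beta)$.
(b) $\mathrm{rng}(f) \subseteq \kappa$.
(c) $\mathrm{rng}(f)$ is confinal with $\kappa$.

$\mathrm{rng}(f)$ is the $x$ we are looking for.

COROLLARY 5.10 (AC). If $\kappa$ is an infinite cardinal, then $\mathrm{cf}(\kappa)$ is a
regular cardinal.

PROOF. $\mathrm{cf}(\mathrm{cf}(\kappa)) \le \mathrm{cf}(\kappa)$ is clear. We must show $\mathrm{cf}(\kappa) \le \mathrm{cf}(\mathrm{cf}(\kappa))$. By
the lemma above there are $x, f$ such that $x \subseteq \kappa$, $\sup(x) = \kappa$ and $f\colon \mathrm{cf}(\kappa) \leftrightarrow x$
is an isomorphism. Moreover there is $y \subseteq \mathrm{cf}(\kappa)$ such that $\sup(y) = \mathrm{cf}(\kappa)$ and
$|y| = \mathrm{cf}(\mathrm{cf}(\kappa))$. One can see easily that $\{\, f(\alpha) \mid \alpha \in y \,\}$ is confinal with $\kappa$.
Hence

    $\mathrm{cf}(\kappa) \le |\{\, f(\alpha) \mid \alpha \in y \,\}|$
    $\qquad = |y|$
    $\qquad = \mathrm{cf}(\mathrm{cf}(\kappa)).$

This concludes the proof.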


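THEOREM 5.11 (König). Let $\kappa$ be an infinite cardinal. Then $\kappa < |{}^{\mathrm{cf}(\kappa)}\kappa|$.

PROOF. $\kappa = |{}^{1}\kappa| \le |{}^{\mathrm{cf}(\kappa)}\kappa|$ is clear. Hence it suffices to derive a
contradiction from the assumption that there is a bijection $f\colon \kappa \leftrightarrow {}^{\mathrm{cf}(\kappa)}\kappa$.
According to Lemma 5.9 there exists $x \subseteq \kappa$ such that $\sup(x) = \kappa$ and moreover an
isomorphism $g\colon \mathrm{cf}(\kappa) \leftrightarrow x$. For every $\alpha < \mathrm{cf}(\kappa)$ we therefore have $g(\alpha) < \kappa$
and hence

    $|\{\, f(\gamma)(\alpha) \mid \gamma < g(\alpha) \,\}| \le |g(\alpha)| < \kappa,$

hence $\{\, f(\gamma)(\alpha) \mid \gamma < g(\alpha) \,\} \subsetneq \kappa$. Let

    $h\colon \mathrm{cf}(\kappa) \to \kappa,$
    $h(\alpha) := \min(\kappa \setminus \{\, f(\gamma)(\alpha) \mid \gamma < g(\alpha) \,\}).$

We obtain the desired contradiction by showing that $f(\gamma) \neq h$ for all $\gamma < \kappa$.
So let $\gamma < \kappa$. Choose $\alpha < \mathrm{cf}(\kappa)$ such that $\gamma < g(\alpha)$. Then $h(\alpha) \neq f(\gamma)(\alpha)$
by construction of $h$.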


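5.4. Cardinal Powers, Continuum Hypothesis. In this section we
again assume (AC). We define

    $\aleph_\alpha^{\aleph_\beta} := |{}^{\aleph_\beta}\aleph_\alpha|.$

Later we will introduce powers of ordinals as well. It should always be clear
from the context whether we mean ordinal or cardinal power.

THEOREM 5.12 (AC). (a) $\aleph_\beta < \mathrm{cf}(\aleph_\alpha) \to \aleph_\alpha \le \aleph_\alpha^{\aleph_\beta} \le |P(\aleph_\alpha)|$.
(b) $\mathrm{cf}(\aleph_\alpha) \le \aleph_\beta \le \aleph_\alpha \to \aleph_\alpha < \aleph_\alpha^{\aleph_\beta} \le |P(\aleph_\alpha)|$.
(c) $\aleph_\alpha \le \aleph_\beta \to \aleph_\alpha^{\aleph_\beta} = |P(\aleph_\beta)|$.

PROOF. (a).

    $\aleph_\alpha \le |{}^{\aleph_\beta}\aleph_\alpha|$
    $\qquad \le |{}^{\aleph_\beta}({}^{\aleph_\alpha}\{0, 1\})|$
    $\qquad = |{}^{\aleph_\beta \times \aleph_\alpha}\{0, 1\}|$
    $\qquad = |{}^{\aleph_\alpha}\{0, 1\}|$   because $\aleph_\beta \le \aleph_\alpha$
    $\qquad = |P(\aleph_\alpha)|.$

(b).

    $\aleph_\alpha < |{}^{\mathrm{cf}(\aleph_\alpha)}\aleph_\alpha|$   König's Theorem
    $\qquad \le |{}^{\aleph_\beta}\aleph_\alpha|$
    $\qquad \le |P(\aleph_\alpha)|$   as in (a).

(c).

    $|P(\aleph_\beta)| = |{}^{\aleph_\beta}\{0, 1\}|$
    $\qquad \le |{}^{\aleph_\beta}\aleph_\alpha|$
    $\qquad \le |{}^{\aleph_\beta \times \aleph_\alpha}\{0, 1\}|$
    $\qquad = |{}^{\aleph_\beta}\{0, 1\}|$
    $\qquad = |P(\aleph_\beta)|.$

This concludes the proof.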


One can say much more about cardinal powers if one assumes the so-called
continuum hypothesis:

    $|P(\aleph_0)| = \aleph_1.$   (CH)

An obvious generalization to all cardinals is the generalized continuum
hypothesis:

    $|P(\aleph_\alpha)| = \aleph_{\alpha + 1}.$   (GCH)

It is an open problem whether the continuum hypothesis holds in the
cumulative type structure (No. 1 in Hilbert's list of mathematical problems,
posed in a lecture at the international congress of mathematicians in Paris
1900). However, it is known that the continuum hypothesis is independent of
the other axioms of set theory. We shall always indicate use of (CH) or
(GCH).

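THEOREM 5.13 (GCH). (a) $\aleph_\beta < \mathrm{cf}(\aleph_\alpha) \to \aleph_\alpha^{\aleph_\beta} = \aleph_\alpha$.
(b) $\mathrm{cf}(\aleph_\alpha) \le \aleph_\beta \le \aleph_\alpha \to \aleph_\alpha^{\aleph_\beta} = \aleph_{\alpha + 1}$.
(c) $\aleph_\alpha \le \aleph_\beta \to \aleph_\alpha^{\aleph_\beta} = \aleph_{\beta + 1}$.

PROOF. (b) and (c) follow with (GCH) from the previous theorem.

(a). Let $\aleph_\beta < \mathrm{cf}(\aleph_\alpha)$. First note that

    ${}^{\aleph_\beta}\aleph_\alpha = \bigcup \{\, {}^{\aleph_\beta}\gamma \mid \gamma < \aleph_\alpha \,\}.$

This can be seen as follows. $\supseteq$ is clear. $\subseteq$. Let $f\colon \aleph_\beta \to \aleph_\alpha$. Because
of $|f[\aleph_\beta]| \le \aleph_\beta < \mathrm{cf}(\aleph_\alpha)$ we have $\sup(f[\aleph_\beta]) < \gamma < \aleph_\alpha$ for some $\gamma$, hence
$f\colon \aleph_\beta \to \gamma$.

This gives

    $\aleph_\alpha \le |{}^{\aleph_\beta}\aleph_\alpha|$   previous theorem
    $\qquad = |\bigcup \{\, {}^{\aleph_\beta}\gamma \mid \gamma < \aleph_\alpha \,\}|$   by the note above
    $\qquad \le \max\{|\aleph_\alpha|, \sup_{\gamma < \aleph_\alpha} |{}^{\aleph_\beta}\gamma|\}$   by Theorem 5.5(a).

Hence it suffices to show that $|{}^{\aleph_\beta}\gamma| \le \aleph_\alpha$ for $\gamma < \aleph_\alpha$. So let $\gamma < \aleph_\alpha$.

    $|{}^{\aleph_\beta}\gamma| \le |{}^{\aleph_\beta \times \gamma}\{0, 1\}|$
    $\qquad \le |{}^{\aleph_\beta \times \aleph_\delta}\{0, 1\}|$   for some $\delta$ with $|\gamma| \le \aleph_\delta < \aleph_\alpha$
    $\qquad = \begin{cases} |P(\aleph_\delta)| & \text{if } \beta < \delta \\ |P(\aleph_\beta)| & \text{if } \delta \le \beta \end{cases}$
    $\qquad = \begin{cases} \aleph_{\delta + 1} & \text{if } \beta < \delta \\ \aleph_{\beta + 1} & \text{if } \delta \le \beta \end{cases}$
    $\qquad \le \aleph_\alpha.$

This concludes the proof.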


6. Ordinal Arithmetic 


We define addition, multiplication and exponentiation for ordinals and 
prove their basic properties. We also treat Cantor’s normal form. 


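6.1. Ordinal Addition. Let
    $\alpha + 0 := \alpha$,
    $\alpha + (\beta+1) := (\alpha+\beta)+1$,
    $\alpha + \beta := \sup\{\, \alpha+\gamma \mid \gamma < \beta \,\}$    if $\beta$ limit.
More precisely, define $s_\alpha\colon \mathrm{On} \to V$ by

    $s_\alpha(0) := \alpha$,
    $s_\alpha(\beta+1) := s_\alpha(\beta)+1$,
    $s_\alpha(\beta) := \bigcup \mathrm{rng}(s_\alpha{\restriction}\beta)$    if $\beta$ limit

and then let $\alpha + \beta := s_\alpha(\beta)$.

LEMMA 6.1 (Properties of Ordinal Addition). (a) $\alpha+\beta \in \mathrm{On}$.
(b) $0+\beta = \beta$.
(c) $\exists \alpha,\beta\,(\alpha+\beta \ne \beta+\alpha)$.
(d) $\beta < \gamma \to \alpha+\beta < \alpha+\gamma$.
(e) There are $\alpha, \beta, \gamma$ such that $\alpha < \beta$, but $\alpha+\gamma \not< \beta+\gamma$.
(f) $\alpha \le \beta \to \alpha+\gamma \le \beta+\gamma$.
(g) For $\alpha \le \beta$ there is a unique $\gamma$ such that $\alpha+\gamma = \beta$.
(h) If $\beta$ is a limit, then so is $\alpha+\beta$.
(i) $(\alpha+\beta)+\gamma = \alpha+(\beta+\gamma)$.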


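PROOF. (a). Induction on $\beta$. Case 0. Clear. Case $\beta+1$. Then
$\alpha+(\beta+1) = (\alpha+\beta)+1 \in \mathrm{On}$, for by IH $\alpha+\beta \in \mathrm{On}$. Case $\beta$ limit. Then
$\alpha+\beta = \sup\{\, \alpha+\gamma \mid \gamma<\beta \,\} \in \mathrm{On}$, for by IH $\alpha+\gamma \in \mathrm{On}$ for all $\gamma < \beta$.

(b). Induction on $\beta$. Case 0. Clear. Case $\beta+1$. Then $0+(\beta+1) =
(0+\beta)+1 = \beta+1$, for by IH $0+\beta = \beta$. Case $\beta$ limit. Then

    $0+\beta = \sup\{\, 0+\gamma \mid \gamma<\beta \,\}$
    $= \sup\{\, \gamma \mid \gamma<\beta \,\}$    by IH
    $= \bigcup\beta$
    $= \beta$,    because $\beta$ is a limit.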


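(c). $1+\omega = \sup\{\, 1+n \mid n \in \omega \,\} = \omega \ne \omega+1$.

(d). Induction on $\gamma$. Case 0. Clear. Case $\gamma+1$. Then

    $\beta < \gamma+1$,
    $\beta < \gamma \lor \beta = \gamma$,
    $\alpha+\beta < \alpha+\gamma \lor \alpha+\beta = \alpha+\gamma$    by IH,
    $\alpha+\beta \le \alpha+\gamma < (\alpha+\gamma)+1 = \alpha+(\gamma+1)$.

Case $\gamma$ limit. Let $\beta < \gamma$, hence $\beta < \delta$ for some $\delta < \gamma$. Then $\alpha+\beta < \alpha+\delta$
by IH, hence $\alpha+\beta < \sup\{\, \alpha+\delta \mid \delta<\gamma \,\} = \alpha+\gamma$.

(e). $0 < 1$, but $0+\omega = \omega = 1+\omega$.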

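(f). We first remark that there can be no $\beta$ such that $\alpha < \beta < \alpha+1$,
for otherwise we would have in case $\beta \in \alpha$ the contradiction $\beta \in \alpha \in \beta$ and
in case $\beta = \alpha$ the contradiction $\alpha \in \alpha$. As a second preliminary remark we
note that

    $\alpha \le \beta \to \alpha+1 \le \beta+1$,

for in case $\beta+1 < \alpha+1$ we would have $\alpha < \beta+1 < \alpha+1$, which cannot be the
case (as we have just seen). We now show the claim $\alpha \le \beta \to \alpha+\gamma \le \beta+\gamma$
by induction on $\gamma$. Case 0. Clear. Case $\gamma+1$. Then

    $\alpha+\gamma \le \beta+\gamma$    by IH,
    $(\alpha+\gamma)+1 \le (\beta+\gamma)+1$    by the second preliminary remark,
    $\alpha+(\gamma+1) \le \beta+(\gamma+1)$    by definition.

Case $\gamma$ limit. Then

    $\alpha+\delta \le \beta+\delta$    for all $\delta < \gamma$, by IH,
    $\alpha+\delta \le \sup\{\, \beta+\delta \mid \delta<\gamma \,\}$,
    $\sup\{\, \alpha+\delta \mid \delta<\gamma \,\} \le \sup\{\, \beta+\delta \mid \delta<\gamma \,\}$,
    $\alpha+\gamma \le \beta+\gamma$    by definition.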


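(g). Uniqueness of $\gamma$ follows from (d). Existence: Let $\alpha \le \beta$. By (b)
and (f) $\beta = 0+\beta \le \alpha+\beta$. Let $\gamma$ be the least ordinal such that $\beta \le \alpha+\gamma$.
We show that $\beta = \alpha+\gamma$. Case $\gamma = 0$. Then $\beta \le \alpha+\gamma = \alpha+0 = \alpha \le \beta$,
hence $\beta = \alpha+\gamma$. Case $\gamma = \gamma'+1$. Then $\alpha+\gamma' < \beta$, hence $(\alpha+\gamma')+1 \le \beta$
by the first preliminary remark for (f) and hence $\alpha+\gamma = \beta$. Case $\gamma$ limit.
Then $\alpha+\delta < \beta$ for all $\delta < \gamma$, hence $\alpha+\gamma = \sup\{\, \alpha+\delta \mid \delta<\gamma \,\} \le \beta$ and
hence $\alpha+\gamma = \beta$.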

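(h). Let $\beta$ be a limit. We use the characterization of limits in Lemma 3.27(a).
$\alpha+\beta \ne 0$: Because of $0 \le \alpha$ we have $0 < \beta = 0+\beta \le \alpha+\beta$ by (f).
$\gamma < \alpha+\beta \to \gamma+1 < \alpha+\beta$: Let $\gamma < \alpha+\beta = \sup\{\, \alpha+\delta \mid \delta<\beta \,\}$, hence
$\gamma < \alpha+\delta$ for some $\delta < \beta$, hence $\gamma+1 < \alpha+(\delta+1)$ with $\delta+1 < \beta$, hence
$\gamma+1 < \sup\{\, \alpha+\delta \mid \delta<\beta \,\}$.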

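(i). Induction on $\gamma$. Case 0. Clear. Case $\gamma+1$. Then

    $(\alpha+\beta)+(\gamma+1) = [(\alpha+\beta)+\gamma]+1$
    $= [\alpha+(\beta+\gamma)]+1$    by IH
    $= \alpha+[(\beta+\gamma)+1]$
    $= \alpha+[\beta+(\gamma+1)]$.

Case $\gamma$ limit. By (h) also $\beta+\gamma$ is a limit. Hence

    $(\alpha+\beta)+\gamma = \sup\{\, (\alpha+\beta)+\delta \mid \delta<\gamma \,\}$
    $= \sup\{\, \alpha+(\beta+\delta) \mid \delta<\gamma \,\}$    by IH
    $= \sup\{\, \alpha+\varepsilon \mid \varepsilon<\beta+\gamma \,\}$    see below
    $= \alpha+(\beta+\gamma)$.

The equality of both suprema can be seen as follows. If $\varepsilon < \beta+\gamma$, then
$\varepsilon < \beta+\delta$ for some $\delta < \gamma$ (by definition of $\beta+\gamma$) and hence $\alpha+\varepsilon < \alpha+(\beta+\delta)$.
If conversely $\delta < \gamma$, then $\beta+\delta < \beta+\gamma$, hence $\alpha+(\beta+\delta) = \alpha+\varepsilon$ for some
$\varepsilon < \beta+\gamma$ (take $\varepsilon := \beta+\delta$).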


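6.2. Ordinal Multiplication. Ordinal multiplication is defined by

    $\alpha \cdot 0 := 0$,
    $\alpha \cdot (\beta+1) := (\alpha \cdot \beta)+\alpha$,
    $\alpha \cdot \beta := \sup\{\, \alpha \cdot \gamma \mid \gamma<\beta \,\}$    if $\beta$ limit.

We write $\alpha\beta$ for $\alpha \cdot \beta$.

LEMMA 6.2 (Properties of Ordinal Multiplication). (a) $\alpha\beta \in \mathrm{On}$.
(b) $0\beta = 0$, $1\beta = \beta$.
(c) $\exists \alpha,\beta\,(\alpha\beta \ne \beta\alpha)$.
(d) $0 < \alpha \to \beta < \gamma \to \alpha\beta < \alpha\gamma$.
(e) There are $\alpha, \beta, \gamma$ such that $0 < \gamma$ and $\alpha < \beta$, but $\alpha\gamma \not< \beta\gamma$.
(f) $\alpha \le \beta \to \alpha\gamma \le \beta\gamma$.
(g) If $0 < \alpha$ and $\beta$ is a limit, then so is $\alpha\beta$.
(h) $\alpha(\beta+\gamma) = \alpha\beta+\alpha\gamma$.
(i) There are $\alpha, \beta, \gamma$ such that $(\alpha+\beta)\gamma \ne \alpha\gamma+\beta\gamma$.
(j) $\alpha\beta = 0 \to \alpha = 0 \lor \beta = 0$.
(k) $(\alpha\beta)\gamma = \alpha(\beta\gamma)$.
(l) If $0 < \beta$, then there are unique $\gamma, \rho$ such that $\alpha = \beta\gamma+\rho$ and $\rho < \beta$.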


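PROOF. (a). Induction on $\beta$. Case 0. Clear. Case $\beta+1$. Then
$\alpha(\beta+1) = (\alpha\beta)+\alpha \in \mathrm{On}$, for by IH $\alpha\beta \in \mathrm{On}$. Case $\beta$ limit. Then
$\alpha\beta = \sup\{\, \alpha\gamma \mid \gamma<\beta \,\} \in \mathrm{On}$, for by IH $\alpha\gamma \in \mathrm{On}$ for all $\gamma < \beta$.

(b). $0\beta = 0$: Induction on $\beta$. Case 0. Clear. Case $\beta+1$. Then
$0(\beta+1) = (0\beta)+0 = 0$ by IH. Case $\beta$ limit. $0\beta = \sup\{\, 0\gamma \mid \gamma<\beta \,\} = 0$
by IH. $1\beta = \beta$: Induction on $\beta$. Case 0. Clear. Case $\beta+1$. Then
$1(\beta+1) = (1\beta)+1 = \beta+1$ by IH. Case $\beta$ limit. $1\beta = \sup\{\, 1\gamma \mid \gamma<\beta \,\} =
\sup\{\, \gamma \mid \gamma<\beta \,\} = \beta$ by IH.

(c). First note that for all $n \in \omega$ with $0 < n$ we have $n\omega = \sup\{\, nm \mid m<\omega \,\} = \omega$.
This implies $2\omega = \omega$, but $\omega 2 = \omega(1+1) = \omega 1+\omega = \omega+\omega > \omega$.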

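(d). Let $0 < \alpha$. We show $\beta < \gamma \to \alpha\beta < \alpha\gamma$ by induction on $\gamma$. Case 0.
Clear. Case $\gamma+1$. Then

    $\beta < \gamma+1$,
    $\beta < \gamma \lor \beta = \gamma$,
    $\alpha\beta < \alpha\gamma \lor \alpha\beta = \alpha\gamma$    by IH,
    $\alpha\beta \le \alpha\gamma < (\alpha\gamma)+\alpha = \alpha(\gamma+1)$.

Case $\gamma$ limit. Let $\beta < \gamma$, hence $\beta < \delta$ for some $\delta < \gamma$. Then $\alpha\beta < \alpha\delta$ by
IH, hence $\alpha\beta < \sup\{\, \alpha\delta \mid \delta<\gamma \,\} = \alpha\gamma$.

(e). We have $0 < \omega$ and $1 < 2$, but $1\omega = \omega = 2\omega$.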
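(f). We show the claim $\alpha \le \beta \to \alpha\gamma \le \beta\gamma$ by induction on $\gamma$. Case 0.
Clear. Case $\gamma+1$. Then

    $\alpha\gamma \le \beta\gamma$    by IH,
    $(\alpha\gamma)+\alpha \le (\beta\gamma)+\alpha \le (\beta\gamma)+\beta$    by Lemma 6.1(f) and (d),
    $\alpha(\gamma+1) \le \beta(\gamma+1)$    by definition.

Case $\gamma$ limit. Then

    $\alpha\delta \le \beta\delta$    for all $\delta < \gamma$, by IH,
    $\alpha\delta \le \sup\{\, \beta\delta \mid \delta<\gamma \,\}$,
    $\sup\{\, \alpha\delta \mid \delta<\gamma \,\} \le \sup\{\, \beta\delta \mid \delta<\gamma \,\}$,
    $\alpha\gamma \le \beta\gamma$    by definition.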

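(g). Let $0 < \alpha$ and $\beta$ limit. For the proof that $\alpha\beta$ is a limit we again use the
characterization of limits in Lemma 3.27(a). $\alpha\beta \ne 0$: Because of $1 \le \alpha$ and
$\omega \le \beta$ we have $0 < \omega = 1\omega \le \alpha\omega \le \alpha\beta$ by (f) and (d). $\gamma < \alpha\beta \to \gamma+1 < \alpha\beta$: Let
$\gamma < \alpha\beta = \sup\{\, \alpha\delta \mid \delta<\beta \,\}$, hence $\gamma < \alpha\delta$ for some $\delta < \beta$, hence $\gamma+1 <
\alpha\delta+1 \le \alpha\delta+\alpha = \alpha(\delta+1)$ with $\delta+1 < \beta$, hence $\gamma+1 < \sup\{\, \alpha\delta \mid \delta<\beta \,\}$.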

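(h). We must show $\alpha(\beta+\gamma) = \alpha\beta+\alpha\gamma$. We may assume $0 < \alpha$. We
employ induction on $\gamma$. Case 0. Clear. Case $\gamma+1$. Then

    $\alpha[\beta+(\gamma+1)] = \alpha[(\beta+\gamma)+1]$
    $= \alpha(\beta+\gamma)+\alpha$
    $= (\alpha\beta+\alpha\gamma)+\alpha$    by IH
    $= \alpha\beta+(\alpha\gamma+\alpha)$
    $= \alpha\beta+\alpha(\gamma+1)$.

Case $\gamma$ limit. By (g) $\alpha\gamma$ is a limit as well. We obtain

    $\alpha(\beta+\gamma) = \sup\{\, \alpha\delta \mid \delta<\beta+\gamma \,\}$
    $= \sup\{\, \alpha(\beta+\varepsilon) \mid \varepsilon<\gamma \,\}$
    $= \sup\{\, \alpha\beta+\alpha\varepsilon \mid \varepsilon<\gamma \,\}$    by IH
    $= \sup\{\, \alpha\beta+\delta \mid \delta<\alpha\gamma \,\}$
    $= \alpha\beta+\alpha\gamma$.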


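(i). $(1+1)\omega = 2\omega = \omega$, but $1\omega+1\omega = \omega+\omega$.

(j). If $0 < \alpha, \beta$, hence $1 \le \alpha, \beta$, then $0 < 1 \cdot 1 \le \alpha\beta$.

(k). Induction on $\gamma$. We may assume $\beta \ne 0$. Case 0. Clear. Case
$\gamma+1$. Then

    $(\alpha\beta)(\gamma+1) = (\alpha\beta)\gamma+\alpha\beta$
    $= \alpha(\beta\gamma)+\alpha\beta$    by IH
    $= \alpha(\beta\gamma+\beta)$    by (h)
    $= \alpha[\beta(\gamma+1)]$.

Case $\gamma$ limit. By (g) $\beta\gamma$ is a limit as well. We obtain

    $(\alpha\beta)\gamma = \sup\{\, (\alpha\beta)\delta \mid \delta<\gamma \,\}$
    $= \sup\{\, \alpha(\beta\delta) \mid \delta<\gamma \,\}$    by IH
    $= \sup\{\, \alpha\varepsilon \mid \varepsilon<\beta\gamma \,\}$
    $= \alpha(\beta\gamma)$.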

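(l). Existence: Let $0 < \beta$, hence $1 \le \beta$ and hence $\alpha = 1\alpha \le \beta\alpha$.
Let $\gamma$ be the least ordinal such that $\alpha \le \beta\gamma$. Case $\alpha = \beta\gamma$. Let $\rho = 0$.
Case $\alpha < \beta\gamma$. If $\gamma = \gamma'+1$, then $\beta\gamma' < \alpha$. Hence there is a $\rho$ such
that $\beta\gamma'+\rho = \alpha$. Moreover, $\rho < \beta$, because from $\rho \ge \beta$ it follows that
$\alpha = \beta\gamma'+\rho \ge \beta\gamma'+\beta = \beta(\gamma'+1) = \beta\gamma$, contradicting our assumption. If
$\gamma$ is a limit, then $\alpha < \beta\gamma = \sup\{\, \beta\delta \mid \delta<\gamma \,\}$, hence $\alpha < \beta\delta$ for some $\delta < \gamma$,
a contradiction.

Uniqueness: Assume $\beta\gamma_1+\rho_1 = \beta\gamma_2+\rho_2$ with $\rho_1, \rho_2 < \beta$. If say $\gamma_1 < \gamma_2$,
then

    $\beta\gamma_1+\rho_1 < \beta\gamma_1+\beta$
    $= \beta(\gamma_1+1)$
    $\le \beta\gamma_2$
    $\le \beta\gamma_2+\rho_2$,

hence we have a contradiction. Therefore $\gamma_1 = \gamma_2$, and hence $\rho_1 = \rho_2$.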


COROLLARY 6.3. Every ordinal $\alpha$ can be written uniquely in the form
$\alpha = \omega\gamma+n$. Here $n = 0$ iff $\alpha = 0$ or $\alpha$ is a limit.


PROOF. It remains to be shown that for every $\gamma$ either $\omega\gamma = 0$ or $\omega\gamma$ is a
limit. In case $\gamma = 0$ this is clear. In case $\gamma+1$, the ordinal $\omega(\gamma+1) = \omega\gamma+\omega$ is
a limit by Lemma 6.1(h). If $\gamma$ is a limit, then so is $\omega\gamma$ (by Lemma 6.2(g)).


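6.3. Ordinal Exponentiation. Ordinal exponentiation is defined by

    $\alpha^0 := 0$ if $\alpha = 0$, and $\alpha^0 := 1$ otherwise,
    $\alpha^{\beta+1} := \alpha^\beta \alpha$,
    $\alpha^\beta := \sup\{\, \alpha^\gamma \mid \gamma<\beta \,\}$    if $\beta$ limit.

LEMMA 6.4 (Properties of Ordinal Exponentiation). (a) $\alpha^\beta \in \mathrm{On}$.
(b) $0^\beta = 0$, $1^\beta = 1$.
(c) $1 < \alpha \to \beta < \gamma \to \alpha^\beta < \alpha^\gamma$.
(d) There are $\alpha, \beta, \gamma$ such that $1 < \gamma$ and $1 < \alpha < \beta$, but $\alpha^\gamma \not< \beta^\gamma$.
(e) $\alpha \le \beta \to \alpha^\gamma \le \beta^\gamma$.
(f) If $1 < \alpha$ and $\beta$ is a limit, then so is $\alpha^\beta$.
(g) $\alpha^{\beta+\gamma} = \alpha^\beta \alpha^\gamma$.
(h) $\alpha^{\beta\gamma} = (\alpha^\beta)^\gamma$.
(i) $1 < \alpha \to \beta \le \alpha^\beta$.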

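PROOF. (a). Induction on $\beta$. Case 0. Clear. Case $\beta+1$. Then
$\alpha^{\beta+1} = (\alpha^\beta)\alpha \in \mathrm{On}$, for by IH $\alpha^\beta \in \mathrm{On}$. Case $\beta$ limit. Then $\alpha^\beta = \sup\{\, \alpha^\gamma \mid
\gamma<\beta \,\} \in \mathrm{On}$, for by IH $\alpha^\gamma \in \mathrm{On}$ for all $\gamma < \beta$.

(b). $0^\beta = 0$: Induction on $\beta$. Case 0. $0^0 = 0$ holds by definition. Case
$\beta+1$. Then $0^{\beta+1} = (0^\beta)0 = 0$. Case $\beta$ limit. $0^\beta = \sup\{\, 0^\gamma \mid \gamma<\beta \,\} = 0$
by IH. $1^\beta = 1$: Induction on $\beta$. Case 0. Clear. Case $\beta+1$. Then
$1^{\beta+1} = (1^\beta)1 = 1$ by IH. Case $\beta$ limit. $1^\beta = \sup\{\, 1^\gamma \mid \gamma<\beta \,\} = \sup\{\, 1 \mid
\gamma<\beta \,\} = 1$ by IH.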

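(c). Let $1 < \alpha$. We show $\beta < \gamma \to \alpha^\beta < \alpha^\gamma$ by induction on $\gamma$. Case 0.
Clear. Case $\gamma+1$. Then

    $\beta < \gamma+1$,
    $\beta < \gamma \lor \beta = \gamma$,
    $\alpha^\beta < \alpha^\gamma \lor \alpha^\beta = \alpha^\gamma$    by IH,
    $\alpha^\beta \le \alpha^\gamma < \alpha^\gamma+\alpha^\gamma \le \alpha^{\gamma+1}$.

Case $\gamma$ limit. Let $\beta < \gamma$, hence $\beta < \delta$ for some $\delta < \gamma$. Then $\alpha^\beta < \alpha^\delta$ by
IH, hence $\alpha^\beta < \sup\{\, \alpha^\delta \mid \delta<\gamma \,\} = \alpha^\gamma$.

(d). For $1 < n$ we have $n^\omega = \sup\{\, n^m \mid m<\omega \,\} = \omega$ and hence
$2^\omega = \omega = 3^\omega$.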

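(e). We show the claim $\alpha \le \beta \to \alpha^\gamma \le \beta^\gamma$ by induction on $\gamma$. Case 0.
Clear. Case $\gamma+1$. Let $\alpha \le \beta$. Then

    $\alpha^\gamma \le \beta^\gamma$    by IH,
    $\alpha^{\gamma+1} = \alpha^\gamma \alpha$
    $\le \beta^\gamma \alpha$
    $\le \beta^\gamma \beta$
    $= \beta^{\gamma+1}$.

Case $\gamma$ limit. Let again $\alpha \le \beta$. Then

    $\alpha^\delta \le \beta^\delta$    for all $\delta < \gamma$, by IH,
    $\alpha^\delta \le \sup\{\, \beta^\delta \mid \delta<\gamma \,\}$,
    $\sup\{\, \alpha^\delta \mid \delta<\gamma \,\} \le \sup\{\, \beta^\delta \mid \delta<\gamma \,\}$,
    $\alpha^\gamma \le \beta^\gamma$    by definition.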

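(f). Let $1 < \alpha$ and $\beta$ limit. For the proof that $\alpha^\beta$ is a limit we again use the
characterization of limits in Lemma 3.27(a). $\alpha^\beta \ne 0$: Because of $1 \le \alpha$ we
have $1 = 1^\beta \le \alpha^\beta$. $\gamma < \alpha^\beta \to \gamma+1 < \alpha^\beta$: Let $\gamma < \alpha^\beta = \sup\{\, \alpha^\delta \mid \delta<\beta \,\}$,
hence $\gamma < \alpha^\delta$ for some $\delta < \beta$, hence $\gamma+1 < \alpha^\delta+1 \le \alpha^\delta 2 \le \alpha^{\delta+1}$ with
$\delta+1 < \beta$, hence $\gamma+1 < \sup\{\, \alpha^\delta \mid \delta<\beta \,\}$.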

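(g). We must show $\alpha^{\beta+\gamma} = \alpha^\beta \alpha^\gamma$. We may assume $\alpha \ne 0, 1$. The proof
is by induction on $\gamma$. Case 0. Clear. Case $\gamma+1$. Then

    $\alpha^{\beta+\gamma+1} = \alpha^\beta \alpha^\gamma \alpha$    by IH
    $= \alpha^\beta \alpha^{\gamma+1}$.

Case $\gamma$ limit.

    $\alpha^{\beta+\gamma} = \sup\{\, \alpha^\delta \mid \delta<\beta+\gamma \,\}$
    $= \sup\{\, \alpha^{\beta+\varepsilon} \mid \varepsilon<\gamma \,\}$
    $= \sup\{\, \alpha^\beta \alpha^\varepsilon \mid \varepsilon<\gamma \,\}$    by IH
    $= \sup\{\, \alpha^\beta \delta \mid \delta<\alpha^\gamma \,\}$
    $= \alpha^\beta \alpha^\gamma$.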


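(h). We must show $\alpha^{\beta\gamma} = (\alpha^\beta)^\gamma$. We may assume $\alpha \ne 0, 1$ and $\beta \ne 0$.
The proof is by induction on $\gamma$. Case 0. Clear. Case $\gamma+1$. Then

    $\alpha^{\beta(\gamma+1)} = \alpha^{\beta\gamma} \alpha^\beta$
    $= (\alpha^\beta)^\gamma \alpha^\beta$    by IH
    $= (\alpha^\beta)^{\gamma+1}$.

Case $\gamma$ limit. Because of $\alpha \ne 0, 1$ and $\beta \ne 0$ we know that $\alpha^{\beta\gamma}$ and $(\alpha^\beta)^\gamma$
are limits. Hence

    $\alpha^{\beta\gamma} = \sup\{\, \alpha^\delta \mid \delta<\beta\gamma \,\}$
    $= \sup\{\, \alpha^{\beta\varepsilon} \mid \varepsilon<\gamma \,\}$
    $= \sup\{\, (\alpha^\beta)^\varepsilon \mid \varepsilon<\gamma \,\}$    by IH
    $= (\alpha^\beta)^\gamma$.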


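(i). Let $1 < \alpha$. We show $\beta \le \alpha^\beta$ by induction on $\beta$. Case 0. Clear.
Case $\beta+1$. Then $\beta \le \alpha^\beta$ by IH, hence

    $\beta+1 \le \alpha^\beta+1$
    $\le \alpha^\beta+\alpha^\beta$
    $\le \alpha^{\beta+1}$.

Case $\beta$ limit.

    $\beta = \sup\{\, \gamma \mid \gamma<\beta \,\}$
    $\le \sup\{\, \alpha^\gamma \mid \gamma<\beta \,\}$    by IH
    $= \alpha^\beta$.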


This concludes the proof. 


6.4. Cantor Normal Form. 
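THEOREM 6.5 (Cantor Normal Form). Let $\gamma \ge 2$. Every $\alpha$ can be written
uniquely in the form

    $\alpha = \gamma^{\alpha_1}\beta_1 + \dots + \gamma^{\alpha_n}\beta_n$    where $\alpha \ge \alpha_1 > \dots > \alpha_n$ and $0 < \beta_i < \gamma$.

PROOF. Existence. Induction on $\alpha$. Let $\delta$ be minimal such that $\alpha < \gamma^\delta$;
such a $\delta$ exists since $\alpha \le \gamma^\alpha < \gamma^{\alpha+1}$. But $\delta$ cannot be a limit, for otherwise $\alpha < \gamma^\varepsilon$
for some $\varepsilon < \delta$. If $\delta = 0$, then $\alpha = 0$ and the claim is trivial. So let
$\delta = \alpha_1+1$, hence

    $\gamma^{\alpha_1} \le \alpha < \gamma^{\alpha_1+1}$.

Division with remainder gives

    $\alpha = \gamma^{\alpha_1}\beta_1+\rho$    with $\rho < \gamma^{\alpha_1}$.

Clearly $0 < \beta_1 < \gamma$. Now if $\rho = 0$ we are done. Otherwise we have

    $\rho = \gamma^{\alpha_2}\beta_2 + \dots + \gamma^{\alpha_n}\beta_n$    by IH.

We still must show $\alpha_1 > \alpha_2$. But this holds, because $\alpha_2 \ge \alpha_1$ entails
$\rho \ge \gamma^{\alpha_2} \ge \gamma^{\alpha_1}$, a contradiction.

Uniqueness. Let

    $\gamma^{\alpha_1}\beta_1 + \dots + \gamma^{\alpha_n}\beta_n = \gamma^{\alpha'_1}\beta'_1 + \dots + \gamma^{\alpha'_m}\beta'_m$

and assume that both representations are different. Since no such sum can
extend the other, we must have $i \le n, m$ such that $(\alpha_i, \beta_i) \ne (\alpha'_i, \beta'_i)$. By
Lemma 6.1(d) we can assume $i = 1$. First we have

    $\gamma^{\alpha_1}\beta_1 + \dots + \gamma^{\alpha_{n-1}}\beta_{n-1} + \gamma^{\alpha_n}\beta_n$
    $< \gamma^{\alpha_1}\beta_1 + \dots + \gamma^{\alpha_{n-1}}\beta_{n-1} + \gamma^{\alpha_n+1}$    since $\beta_n < \gamma$
    $\le \gamma^{\alpha_1}\beta_1 + \dots + \gamma^{\alpha_{n-1}}(\beta_{n-1}+1)$    for $\alpha_n < \alpha_{n-1}$
    $\le \dots$
    $\le \gamma^{\alpha_1}(\beta_1+1)$.

Now if e.g. $\alpha_1 < \alpha'_1$, then we would have $\gamma^{\alpha_1}\beta_1 + \dots + \gamma^{\alpha_n}\beta_n < \gamma^{\alpha_1}(\beta_1+1) \le
\gamma^{\alpha_1+1} \le \gamma^{\alpha'_1}$, which cannot be. Hence $\alpha_1 = \alpha'_1$. If e.g. $\beta_1 < \beta'_1$, then we
would have $\gamma^{\alpha_1}\beta_1 + \dots + \gamma^{\alpha_n}\beta_n < \gamma^{\alpha_1}(\beta_1+1) \le \gamma^{\alpha_1}\beta'_1$, which again cannot
be the case. Hence $\beta_1 = \beta'_1$.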
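
For finite ordinals the existence part of this proof is an algorithm. The following Python sketch follows it step by step, under the assumption that both the ordinal and the base are natural numbers (the function name cantor_normal_form is hypothetical and not part of the text): take the minimal delta with a < base**delta, put alpha_1 = delta - 1, divide with remainder, and continue with the remainder.

```python
def cantor_normal_form(a, base):
    """Return [(alpha_1, beta_1), ..., (alpha_n, beta_n)] such that
    a == sum(base**alpha_i * beta_i), alpha_1 > ... > alpha_n and
    0 < beta_i < base. Assumption: a is a natural number, base >= 2 is finite."""
    if base < 2 or a < 0:
        raise ValueError("need base >= 2 and a >= 0")
    terms = []
    while a > 0:
        # delta is minimal with a < base**delta; since a > 0 it is a successor.
        delta = 0
        while base ** delta <= a:
            delta += 1
        exponent = delta - 1
        # Division with remainder: a = base**exponent * coeff + rest,
        # where rest < base**exponent; the loop continues with rest.
        coeff, a = divmod(a, base ** exponent)
        terms.append((exponent, coeff))
    return terms
```

For instance, cantor_normal_form(47, 3) returns the exponent and coefficient pairs of $47 = 3^3 \cdot 1 + 3^2 \cdot 2 + 3^0 \cdot 2$.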


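COROLLARY 6.6 (Cantor Normal Form With Base $\omega$). Every $\alpha$ can be
written uniquely in the form

    $\alpha = \omega^{\alpha_1} + \dots + \omega^{\alpha_n}$    with $\alpha \ge \alpha_1 \ge \dots \ge \alpha_n$.

An ordinal $\alpha$ is an additive principal number when $\alpha \ne 0$ and $\beta+\gamma < \alpha$
for $\beta, \gamma < \alpha$.

COROLLARY 6.7. Additive principal numbers are exactly the ordinals of
the form $\omega^\xi$.

PROOF. This follows from Cantor's normal form with base $\omega$.

COROLLARY 6.8 (Cantor Normal Form With Base 2). Every $\alpha$ can be
written uniquely in the form

    $\alpha = 2^{\alpha_1} + \dots + 2^{\alpha_n}$    with $\alpha \ge \alpha_1 > \dots > \alpha_n$.

Let $\omega_0 := 1$, $\omega_{k+1} := \omega^{\omega_k}$ and $\varepsilon_0 := \sup_{k<\omega} \omega_k$. Notice that $\varepsilon_0$ is the
least ordinal $\alpha$ such that $\omega^\alpha = \alpha$.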


7. Normal Functions 


In [29] Veblen investigated the notion of a continuous monotonic function
on a segment of the ordinals, and introduced a certain hierarchy of
normal functions. His goal was to generalize Cantor's theory of $\varepsilon$-numbers
(from [5]).

7.1. Closed Unbounded Classes. Let $\Omega$ be a regular cardinal $> \omega$
or $\Omega = \mathrm{On}$. An important example is $\Omega = \aleph_1$, that is the case where $\Omega$ is the
set of all countable ordinals. Let $\alpha, \beta, \gamma, \delta, \varepsilon, \xi, \eta, \zeta$ denote elements of $\Omega$. A
function $\varphi\colon \Omega \to \Omega$ is monotone if $\alpha < \beta$ implies $\varphi\alpha < \varphi\beta$. $\varphi$ is continuous
if $\varphi\alpha = \sup_{\xi<\alpha} \varphi\xi$ for every limit $\alpha$. $\varphi$ is normal if $\varphi$ is monotone and
continuous.


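LEMMA 7.1. For every monotone function $\varphi$ we have $\alpha \le \varphi\alpha$.


PROOF. Induction on $\alpha$. Case 0. $0 \le \varphi 0$. Case $\alpha+1$. $\alpha \le \varphi\alpha <
\varphi(\alpha+1)$. Case $\alpha$ limit. $\alpha = \sup_{\xi<\alpha} \xi \le \sup_{\xi<\alpha} \varphi\xi \le \varphi\alpha$.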


A class $B \subseteq \Omega$ is bounded if $\sup(B) \in \Omega$. A class $A \subseteq \Omega$ is closed if
for every bounded subclass $B \subseteq A$ we have $\sup(B) \in A$. Closed unbounded
classes $A \subseteq \Omega$ are called normal or closed unbounded in $\Omega$ (club for short).

If for instance $\Omega = \aleph_1$, then every $B \subseteq \Omega$ is a set, and $B$ is bounded iff
$B$ is countable. If $\Omega = \mathrm{On}$, then $B$ is bounded iff $B$ is a set.

By Corollary 3.19 (to the Isomorphy Theorem of Mostowski) for every
$A \subseteq \mathrm{On}$ we have a uniquely determined isomorphism of an ordinal class onto
$A$, that is an $f\colon \mathrm{On} \to A$ (or $f\colon \alpha \to A$). This isomorphism is called the
ordering function of $A$. Notice that $f$ is the monotone enumeration of $A$.


LEMMA 7.2. The range of a normal function is a normal class. Conversely,
the ordering function of a normal class is a normal function.


PROOF. Let $\varphi$ be a normal function. $\varphi[\Omega]$ is unbounded, since for every
$\alpha$ we have $\alpha \le \varphi\alpha$. We now show that $\varphi[\Omega]$ is closed. So let $B = \{\, \varphi\xi \mid \xi \in
A \,\}$ be bounded, i.e., $\sup(B) \in \Omega$. Because of $\xi \le \varphi\xi$ then also $A$ is bounded.
We must show $\sup(B) = \varphi\alpha$ for some $\alpha$. If $A$ has a maximal element we
are done. Otherwise $\alpha := \sup(A)$ is a limit. Then $\varphi\alpha = \sup_{\xi<\alpha} \varphi\xi =
\sup_{\xi \in A} \varphi\xi = \sup(B)$. Conversely, let $A$ be closed and unbounded. We
define a function $\varphi\colon \Omega \to A$ by transfinite recursion, as follows.

    $\varphi\alpha := \min\{\, \gamma \in A \mid \forall \xi.\, \xi < \alpha \to \varphi\xi < \gamma \,\}$.

$\varphi$ is well defined, since $A$ is unbounded. Clearly $\varphi$ is the ordering function
of $A$ and hence monotone. It remains to be shown that $\varphi$ is continuous.
So let $\alpha$ be a limit. Since $\varphi[\alpha]$ is bounded (this follows from $\varphi\xi < \varphi\alpha$
for $\xi < \alpha$) and $A$ is closed, we have $\sup_{\xi<\alpha} \varphi\xi \in A$, hence by definition
$\varphi\alpha = \sup_{\xi<\alpha} \varphi\xi$.


LEMMA 7.3. The fixed points of a normal function form a normal class.


PROOF. (Cf. Cantor [5, p. 242]). Let $\varphi$ be a normal function. For every
ordinal $\alpha$ we can construct a fixed point $\beta \ge \alpha$ of $\varphi$ by

    $\beta := \sup\{\, \varphi^n\alpha \mid n \in \mathbb{N} \,\}$.

Hence the class of fixed points of $\varphi$ is unbounded. It is closed as well, since
for every class $B$ of fixed points of $\varphi$ we have $\varphi(\sup(B)) = \sup\{\, \varphi\alpha \mid \alpha \in
B \,\} = \sup\{\, \alpha \mid \alpha \in B \,\} = \sup(B)$, i.e., $\sup(B)$ is a fixed point of $\varphi$.

7.2. The Veblen Hierarchy of Normal Functions. The ordering
function of the class of fixed points of a normal function $\varphi$ has been called
by Veblen the first derivative $\varphi'$ of $\varphi$. For example, the first derivative of
the function $\omega^\xi$ is the function $\varepsilon_\xi$.

LEMMA 7.4 (Veblen). Let $(A_\gamma)_{\gamma<\beta}$ with $\beta$ limit be a decreasing sequence
of normal classes. Then the intersection $\bigcap_{\gamma<\beta} A_\gamma$ is normal as well.


PROOF. Unboundedness. Let $\alpha$ be given and $\delta_\gamma := \min\{\, \xi \in A_\gamma \mid \xi >
\alpha \,\}$. Then $(\delta_\gamma)_{\gamma<\beta}$ is weakly monotonic. Let $\delta := \sup_{\gamma<\beta} \delta_\gamma$. Then $\delta \in A_\gamma$
for every $\gamma < \beta$, since the $A_\gamma$ decrease. Hence $\alpha < \delta \in \bigcap_{\gamma<\beta} A_\gamma$.

Closedness. Let $B \subseteq \bigcap_{\gamma<\beta} A_\gamma$, $B$ bounded. Then $B \subseteq A_\gamma$ for every
$\gamma < \beta$ and therefore $\sup(B) \in A_\gamma$. Hence $\sup(B) \in \bigcap_{\gamma<\beta} A_\gamma$.


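We now define the Veblen hierarchy of normal functions. It is based on
an arbitrary given normal function $\varphi\colon \Omega \to \Omega$. We use transfinite recursion
to define for every $\beta \in \Omega$ a normal function $\varphi_\beta\colon \Omega \to \Omega$:

    $\varphi_0 := \varphi$,
    $\varphi_{\beta+1} := (\varphi_\beta)'$,
    for limits $\beta$ let $\varphi_\beta$ be the ordering function of $\bigcap_{\gamma<\beta} \varphi_\gamma[\Omega]$.

For example, for $\varphi\alpha := 1+\alpha$ we obtain $\varphi_\beta\alpha = \omega^\beta+\alpha$. If we start with
$\varphi\alpha := \omega^\alpha$, then $\varphi_1\alpha = \varepsilon_\alpha$ and $\varphi_2$ enumerates the critical $\varepsilon$-numbers, i.e.,
the ordinals $\alpha$ such that $\varepsilon_\alpha = \alpha$.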

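LEMMA 7.5. Let $\beta > 0$. Then $\varphi_\beta$ is the ordering function of the class of
all common fixed points of all $\varphi_\gamma$ for $\gamma < \beta$.


PROOF. We must show $\varphi_\beta[\Omega] = \{\, \xi \mid \forall \gamma.\, \gamma < \beta \to \varphi_\gamma\xi = \xi \,\}$.

$\subseteq$. This is proved by transfinite induction on $\beta$. In case $\beta+1$ every
$\varphi_{\beta+1}\alpha$ is a fixed point of $\varphi_\beta$ and hence by IH also a fixed point of all $\varphi_\gamma$ for
$\gamma < \beta$. If $\beta$ is a limit, then the claim follows from $\varphi_\beta[\Omega] = \bigcap_{\gamma<\beta} \varphi_\gamma[\Omega]$.

$\supseteq$. Let $\xi$ such that $\forall \gamma.\, \gamma < \beta \to \varphi_\gamma\xi = \xi$ be given. If $\beta$ is a successor,
then $\xi \in \varphi_\beta[\Omega]$ by definition of $\varphi_\beta$. If $\beta$ is a limit, then $\xi \in \bigcap_{\gamma<\beta} \varphi_\gamma[\Omega] =
\varphi_\beta[\Omega]$.


It follows that $\varphi_\gamma(\varphi_\beta\xi) = \varphi_\beta\xi$ for every $\gamma < \beta$.

A further normal function can be obtained as follows. From each of the
normal classes $\varphi_\beta[\Omega]$ pick the least fixed point. The class formed in this
way again is normal, hence can be enumerated by a normal function. This
normal function assigns to every $\beta$ the ordinal $\varphi_\beta 0$.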

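LEMMA 7.6. If $\varphi$ is a normal function with $0 < \varphi 0$, then $\lambda\beta\,\varphi_\beta 0$ is a
normal function as well.


PROOF. We first show

    $\beta < \gamma \to \varphi_\beta 0 < \varphi_\gamma 0$,

by induction on $\gamma$. So let $\beta < \gamma$. Observe that $0 < \varphi_\beta 0$ by IH or in case
$\beta = 0$ by assumption. Hence 0 is not a fixed point of $\varphi_\beta$ and therefore
$0 < \varphi_\gamma 0$. But this implies $\varphi_\beta 0 < \varphi_\beta(\varphi_\gamma 0) = \varphi_\gamma 0$.
We now show that $\lambda\beta\,\varphi_\beta 0$ is continuous. Let $\delta := \sup_{\beta<\gamma} \varphi_\beta 0$ with $\gamma$
limit. We must show $\delta = \varphi_\gamma 0$. Because of $\varphi_\beta 0 \in \varphi_\alpha[\Omega]$ for all $\alpha \le \beta < \gamma$
and since $\varphi_\alpha[\Omega]$ is closed we have $\delta \in \varphi_\alpha[\Omega]$, hence $\delta \in \bigcap_{\alpha<\gamma} \varphi_\alpha[\Omega] = \varphi_\gamma[\Omega]$
and therefore $\delta \ge \varphi_\gamma 0$. On the other hand $\varphi_\beta 0 < \varphi_\beta(\varphi_\gamma 0) = \varphi_\gamma 0$, hence
$\delta \le \varphi_\gamma 0$.


The fixed points of this function, i.e., the ordinals $\alpha$ such that $\varphi_\alpha 0 = \alpha$,
are called strongly critical ordinals. Observe that they depend on the given
normal function $\varphi = \varphi_0$. Their ordering function is usually denoted by $\Gamma$.
Hence by definition $\Gamma_0 := \Gamma 0$ is the least ordinal $\beta$ such that $\varphi_\beta 0 = \beta$.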


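7.3. $\varphi$ Normal Form. We now generalize Cantor's normal form, using
the Veblen hierarchy instead of $\omega$.


LEMMA 7.7.

(41) $\varphi_{\beta_0}\alpha_0 < \varphi_{\beta_1}\alpha_1$ holds iff
    $\alpha_0 < \varphi_{\beta_1}\alpha_1$,    if $\beta_0 < \beta_1$;
    $\alpha_0 < \alpha_1$,    if $\beta_0 = \beta_1$;
    $\varphi_{\beta_0}\alpha_0 < \alpha_1$,    if $\beta_0 > \beta_1$.

(42) $\varphi_{\beta_0}\alpha_0 = \varphi_{\beta_1}\alpha_1$ holds iff
    $\alpha_0 = \varphi_{\beta_1}\alpha_1$,    if $\beta_0 < \beta_1$;
    $\alpha_0 = \alpha_1$,    if $\beta_0 = \beta_1$;
    $\varphi_{\beta_0}\alpha_0 = \alpha_1$,    if $\beta_0 > \beta_1$.


PROOF. $\Leftarrow$. (41). If $\beta_0 < \beta_1$ and $\alpha_0 < \varphi_{\beta_1}\alpha_1$, then $\varphi_{\beta_0}\alpha_0 < \varphi_{\beta_0}\varphi_{\beta_1}\alpha_1 =
\varphi_{\beta_1}\alpha_1$. If $\beta_0 = \beta_1$ and $\alpha_0 < \alpha_1$, then $\varphi_{\beta_0}\alpha_0 < \varphi_{\beta_1}\alpha_1$. If $\beta_0 > \beta_1$ and
$\varphi_{\beta_0}\alpha_0 < \alpha_1$, then $\varphi_{\beta_0}\alpha_0 = \varphi_{\beta_1}\varphi_{\beta_0}\alpha_0 < \varphi_{\beta_1}\alpha_1$. For (42) one argues similarly.

$\Rightarrow$. If the right hand side of (41) is false, we have

    $\alpha_1 \le \varphi_{\beta_0}\alpha_0$,    if $\beta_1 < \beta_0$;
    $\alpha_1 \le \alpha_0$,    if $\beta_1 = \beta_0$;
    $\varphi_{\beta_1}\alpha_1 \le \alpha_0$,    if $\beta_1 > \beta_0$,

hence by $\Leftarrow$ (with 0 and 1 exchanged) $\varphi_{\beta_1}\alpha_1 < \varphi_{\beta_0}\alpha_0$ or $\varphi_{\beta_1}\alpha_1 = \varphi_{\beta_0}\alpha_0$,
hence $\neg(\varphi_{\beta_0}\alpha_0 < \varphi_{\beta_1}\alpha_1)$. If the right hand side of (42) is false, we have

    $\alpha_0 \ne \varphi_{\beta_1}\alpha_1$,    if $\beta_0 < \beta_1$;
    $\alpha_0 \ne \alpha_1$,    if $\beta_0 = \beta_1$;
    $\varphi_{\beta_0}\alpha_0 \ne \alpha_1$,    if $\beta_0 > \beta_1$,

and hence by $\Leftarrow$ in (41) either $\varphi_{\beta_0}\alpha_0 < \varphi_{\beta_1}\alpha_1$ or $\varphi_{\beta_1}\alpha_1 < \varphi_{\beta_0}\alpha_0$, hence
$\varphi_{\beta_0}\alpha_0 \ne \varphi_{\beta_1}\alpha_1$.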


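COROLLARY 7.8. If $\beta_0 \le \beta_1$, then $\varphi_{\beta_0}\alpha \le \varphi_{\beta_1}\alpha$.


PROOF. Assume $\beta_0 < \beta_1$. By Lemma 7.7 (for $\le$) it suffices to show
$\alpha \le \varphi_{\beta_1}\alpha$. But this follows from Lemma 7.1.


COROLLARY 7.9. If $\varphi_{\beta_0}\alpha_0 = \varphi_{\beta_1}\alpha_1$, then $\alpha_0 = \alpha_1$ and $\beta_0 = \beta_1$, provided
$\alpha_0 < \varphi_{\beta_0}\alpha_0$ and $\alpha_1 < \varphi_{\beta_1}\alpha_1$.


PROOF. Case $\beta_0 = \beta_1$. Then $\alpha_0 = \alpha_1$ follows from Lemma 7.7. Case
$\beta_0 < \beta_1$. By Lemma 7.7 we have $\alpha_0 = \varphi_{\beta_1}\alpha_1 = \varphi_{\beta_0}\alpha_0$, contradicting our
assumption. Case $\beta_1 < \beta_0$. Similar.

COROLLARY 7.10. If $\varphi$ is a normal function with $0 < \varphi 0$, then every
fixed point $\alpha$ of $\varphi = \varphi_0$ can be written uniquely in the form $\alpha = \varphi_\beta\alpha'$ with
$\alpha' < \alpha$.


PROOF. We have $\alpha+1 \le \varphi_{\alpha+1}0$ by Lemma 7.6 and hence $\alpha < \varphi_{\alpha+1}\alpha$.
Now let $\beta$ be minimal such that $\alpha < \varphi_\beta\alpha$. By assumption $0 < \beta$. Since $\alpha$
is a fixed point of all $\varphi_\gamma$ with $\gamma < \beta$, we have $\alpha = \varphi_\beta\alpha'$ for some $\alpha'$. Using
$\alpha < \varphi_\beta\alpha$ it follows that $\alpha' < \alpha$.

Uniqueness. Let in addition $\alpha = \varphi_{\beta_1}\alpha_1$ with $\alpha_1 < \alpha$. Then $\alpha < \varphi_{\beta_1}\alpha$,
hence $\beta \le \beta_1$ by the choice of $\beta$. Now if $\beta < \beta_1$, then we would obtain
$\varphi_\beta\alpha = \varphi_\beta\varphi_{\beta_1}\alpha_1 = \varphi_{\beta_1}\alpha_1 = \alpha$, contradicting our choice of $\beta$. Hence $\beta = \beta_1$
and therefore $\alpha_1 = \alpha'$.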


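We now show that every ordinal can be written uniquely in a certain $\varphi$
normal form. Here we assume that our initial normal function $\varphi_0 = \varphi$ is the
exponential function with base $\omega$.

THEOREM 7.11 ($\varphi$ Normal Form). Let $\varphi_0\xi := \omega^\xi$. Then every ordinal $\alpha$
can be written uniquely in the form

    $\alpha = \varphi_{\beta_1}\alpha_1 + \dots + \varphi_{\beta_n}\alpha_n$

with $\varphi_{\beta_1}\alpha_1 \ge \dots \ge \varphi_{\beta_n}\alpha_n$ and $\alpha_i < \varphi_{\beta_i}\alpha_i$ for $i = 1, \dots, n$. If $\alpha < \Gamma_0$, then
in addition we have $\beta_i < \varphi_{\beta_i}\alpha_i$ for $i = 1, \dots, n$.


PROOF. Existence. First write $\alpha$ in Cantor normal form $\alpha = \varphi_0\delta_1 + \dots +
\varphi_0\delta_n$ with $\delta_1 \ge \dots \ge \delta_n$. Every summand with $\delta_i < \varphi_0\delta_i$ is left unchanged.
Every other summand satisfies $\delta_i = \varphi_0\delta_i$ and hence by Corollary 7.10 can
be replaced by $\varphi_\beta\alpha'$ where $\alpha' < \varphi_\beta\alpha'$.

Uniqueness. Let

    $\alpha = \varphi_{\beta_1}\alpha_1 + \dots + \varphi_{\beta_n}\alpha_n = \varphi_{\beta'_1}\alpha'_1 + \dots + \varphi_{\beta'_m}\alpha'_m$

and assume that both representations are different. Since no such sum can
extend the other, we must have $i \le n, m$ such that $(\beta_i, \alpha_i) \ne (\beta'_i, \alpha'_i)$. By
Lemma 6.1(d) we can assume $i = 1$. Now if say $\varphi_{\beta_1}\alpha_1 < \varphi_{\beta'_1}\alpha'_1$, then we
would have (since $\varphi_{\beta'_1}\alpha'_1$ is an additive principal number and $\varphi_{\beta_1}\alpha_1 \ge \dots \ge
\varphi_{\beta_n}\alpha_n$)

    $\varphi_{\beta_1}\alpha_1 + \dots + \varphi_{\beta_n}\alpha_n < \varphi_{\beta'_1}\alpha'_1 \le \varphi_{\beta'_1}\alpha'_1 + \dots + \varphi_{\beta'_m}\alpha'_m$,

a contradiction.
We must show that in case $\alpha < \Gamma_0$ we have $\beta_i < \varphi_{\beta_i}\alpha_i$ for $i = 1, \dots, n$.
So assume $\varphi_{\beta_i}\alpha_i \le \beta_i$ for some $i$. Then

    $\varphi_{\beta_i}0 \le \varphi_{\beta_i}\alpha_i \le \beta_i \le \varphi_{\beta_i}0$,

hence $\varphi_{\beta_i}0 = \beta_i$ and hence

    $\Gamma_0 \le \beta_i = \varphi_{\beta_i}0 \le \varphi_{\beta_i}\alpha_i \le \alpha$.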


From the $\varphi_\beta(\alpha)$ one obtains a unique notation system for ordinals below
$\Gamma_0 := \Gamma 0$. Observe however that $\Gamma_0 = \varphi_{\Gamma_0}0$ by definition of $\Gamma_0$.

8. Notes


Set theory as presented in these notes is commonly called ZFC (Zermelo-
Fraenkel set theory with the axiom of choice; C for Choice). Zermelo wrote
these axioms in 1908, with the exception of the Regularity Axiom (of von
Neumann, 1925) and the replacement scheme (Fraenkel, 1922). Also Skolem
considered principles related to these additional axioms. In ZFC the only
objects are sets; classes are only a convenient way to speak of formulas.

The hierarchy of normal functions defined in Section 7 has been extended
by Veblen [29] to functions with more than one argument. Schütte in [21]
studied these functions carefully and could show that they can be used for a
constructive representation of a segment of the ordinals far bigger than $\Gamma_0$.
To this end he introduced so-called "Klammersymbole" to denote the
multiary Veblen functions.

Bachmann extended the Veblen hierarchy using the first uncountable
ordinal $\Omega$. His approach has later been extended by means of symbols for
higher number classes, first by Pfeiffer for finite number classes and then
by Isles for transfinite number classes. However, the resulting theory was
quite complicated and difficult to work with. An idea of Feferman then
simplified the subject considerably. He introduced functions $\theta_\alpha\colon \mathrm{On} \to \mathrm{On}$
for $\alpha \in \mathrm{On}$, that form again a hierarchy of normal functions and extend the
Veblen hierarchy. One usually writes $\theta\alpha\beta$ instead of $\theta_\alpha(\beta)$ and views $\theta$ as a
binary function. The ordinals $\theta\alpha\beta$ can be defined by transfinite recursion on $\alpha$,
as follows. Assume that $\theta_\xi$ for every $\xi < \alpha$ is defined already. Let $C(\alpha, \beta)$
be the set of all ordinals that can be generated from ordinals $< \beta$ and say the
constants $0, \aleph_1, \dots, \aleph_\omega$ by means of the functions $+$ and $\theta{\restriction}(\{\, \xi \mid \xi < \alpha \,\} \times \mathrm{On})$.
An ordinal $\beta$ is called $\alpha$-critical if $\beta \notin C(\alpha, \beta)$. Then $\theta_\alpha\colon \mathrm{On} \to \mathrm{On}$ is
defined as the ordering function of the class of all $\alpha$-critical ordinals.

Buchholz observed in [4] that the second argument $\beta$ in $\theta\alpha\beta$ is not used
in any essential way, and that the functions $\alpha \mapsto \theta\alpha\aleph_v$ with $v = 0, 1, \dots, \omega$
generate a notation system for ordinals of the same strength as the system
with the binary $\theta$-function. He then went on and defined directly functions
$\psi_v$ with $v \le \omega$, that correspond to $\alpha \mapsto \theta\alpha\aleph_v$. More precisely he defined
$\psi_v\alpha$ for $\alpha \in \mathrm{On}$ and $v \le \omega$ by transfinite recursion on $\alpha$ (simultaneously for
all $v$), as follows.

    $\psi_v\alpha := \min\{\, \gamma \mid \gamma \notin C_v(\alpha) \,\}$,

where $C_v(\alpha)$ is the set of all ordinals that can be generated from the ordinals
$< \aleph_v$ by the functions $+$ and all $\psi_u{\restriction}\{\, \xi \mid \xi < \alpha \,\}$ with $u \le \omega$.


CHAPTER 6 
Proof Theory 


This chapter presents an example of the type of proof theory inspired
by Hilbert's programme and the Gödel incompleteness theorems. The
principal goal will be to offer an example of a true mathematically meaningful
principle not derivable in first-order arithmetic.

The main tool for proving theorems in arithmetic is clearly the induction
schema

    $A(0) \to (\forall x.\, A(x) \to A(S(x))) \to \forall x\, A(x)$.

Here $A(x)$ is an arbitrary formula. An equivalent form of this schema is
"cumulative" or course-of-values induction

    $(\forall x.\, \forall y{<}x\, A(y) \to A(x)) \to \forall x\, A(x)$.

Both schemes refer to the standard ordering of the natural numbers. Now
it is tempting to try to strengthen arithmetic by allowing more general
induction schemas, e.g. with respect to the lexicographical ordering of $\mathbb{N} \times \mathbb{N}$.
More generally, we might pick an arbitrary well-ordering $\prec$ over $\mathbb{N}$ and use
the schema of transfinite induction:

    $(\forall x.\, \forall y{\prec}x\, A(y) \to A(x)) \to \forall x\, A(x)$.

This can be read as follows. Suppose the property $A(x)$ is "progressive",
i.e. from the validity of $A(y)$ for all $y \prec x$ we can always conclude that $A(x)$
holds. Then $A(x)$ holds for all $x$.

One might wonder whether this schema of transfinite induction actually 
strengthens arithmetic. We will prove here a classic result of Gentzen [9] 
which in a sense answers this question completely. However, in order to 
state the result we have to be more explicit about the well-orderings used. 
This is done in the next section. 


1. Ordinals Below $\varepsilon_0$


In order to be able to speak in arithmetical theories about ordinals, we
use a Gödelization of ordinals. This clearly is possible for countable
ordinals only. Here we restrict ourselves to a countable set of relatively
small ordinals, the ordinals below $\varepsilon_0$. Moreover, we equip these ordinals
with an extra structure (a kind of algebra). It is then customary to speak 
of ordinal notations. These ordinal notations could be introduced without 
any set theory in a purely formal, combinatorial way, based on the Cantor 
normal form for ordinals. However, we take advantage of the fact that we 
have just dealt with ordinals within set theory. We also introduce some 
elementary relations and operations for such ordinal notations, which will 
be used later. For brevity we from now on use the word "ordinal" instead
of "ordinal notation".

1.1. Comparison of Ordinals; Natural Sum.

LEMMA 1.1. Let $\omega^{\alpha_m} + \dots + \omega^{\alpha_0}$ and $\omega^{\beta_n} + \dots + \omega^{\beta_0}$ be Cantor normal
forms (with $m, n \ge -1$). Then

    $\omega^{\alpha_m} + \dots + \omega^{\alpha_0} < \omega^{\beta_n} + \dots + \omega^{\beta_0}$

iff there is an $i \ge 0$ such that $\alpha_{m-i} < \beta_{n-i}$, $\alpha_{m-i+1} = \beta_{n-i+1}, \dots, \alpha_m = \beta_n$,
or $m < n$ and $\alpha_m = \beta_n, \dots, \alpha_0 = \beta_{n-m}$.


PROOF. Exercise. 
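
The criterion of Lemma 1.1 is a lexicographic comparison of the exponent sequences, starting with the leading terms. A minimal Python sketch, under the assumption (ours, not the text's) that a Cantor normal form is stored as a tuple of its exponents, largest first, each exponent again being such a tuple; the names ZERO, ONE, OMEGA and less are hypothetical:

```python
ZERO = ()         # the empty sum
ONE = (ZERO,)     # omega^0
OMEGA = (ONE,)    # omega^1


def less(a, b):
    """Decide a < b for Cantor normal forms given as tuples of exponents
    (largest exponent first), following the criterion of Lemma 1.1."""
    for x, y in zip(a, b):          # compare exponents from the leading term on
        if less(x, y):
            return True
        if less(y, x):
            return False
    # All compared exponents agree: the sum with fewer summands is smaller.
    return len(a) < len(b)
```

With this encoding less((ZERO, ZERO, ZERO), OMEGA) holds, reflecting $3 < \omega$, and less(OMEGA, (ONE, ZERO)) holds, reflecting $\omega < \omega+1$.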


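We use the notations 1 for $\omega^0$, $a$ for $\omega^0 + \dots + \omega^0$ with $a$ copies of $\omega^0$
and $\omega^\alpha a$ for $\omega^\alpha + \dots + \omega^\alpha$ again with $a$ copies of $\omega^\alpha$.

LEMMA 1.2. Let $\omega^{\alpha_m} + \dots + \omega^{\alpha_0}$ and $\omega^{\beta_n} + \dots + \omega^{\beta_0}$ be Cantor normal
forms. Then

    $\omega^{\alpha_m} + \dots + \omega^{\alpha_0} + \omega^{\beta_n} + \dots + \omega^{\beta_0} = \omega^{\alpha_m} + \dots + \omega^{\alpha_i} + \omega^{\beta_n} + \dots + \omega^{\beta_0}$,

where $i$ is minimal such that $\alpha_i \ge \beta_n$; if there is no such $i$, let $i = m+1$ (so
$\omega^{\beta_n} + \dots + \omega^{\beta_0}$ results).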


PROOF. Exercise. 
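
Lemma 1.2 says that addition drops every summand of the left argument whose exponent lies below the leading exponent of the right argument. A small Python sketch of this, assuming as an illustration that a Cantor normal form is a tuple of exponents (largest first, each exponent again such a tuple); the names less and add are hypothetical:

```python
ZERO = ()         # the empty sum
ONE = (ZERO,)     # omega^0
OMEGA = (ONE,)    # omega^1


def less(a, b):
    """a < b for Cantor normal forms: lexicographic comparison of exponents."""
    for x, y in zip(a, b):
        if less(x, y):
            return True
        if less(y, x):
            return False
    return len(a) < len(b)


def add(a, b):
    """Ordinal sum a + b as described in Lemma 1.2."""
    if not b:
        return a
    leading = b[0]
    # Keep the summands omega^x of a with x >= leading exponent of b.
    # Since the exponents of a are non-increasing, these form an initial segment.
    kept = tuple(x for x in a if not less(x, leading))
    return kept + b
```

In this encoding add(ONE, OMEGA) gives OMEGA while add(OMEGA, ONE) gives (ONE, ZERO), which mirrors $1+\omega = \omega \ne \omega+1$.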


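One can also define a commutative variant of addition. This is the so-called
natural sum or Hessenberg sum of two ordinals. For Cantor normal
forms $\omega^{\alpha_m} + \dots + \omega^{\alpha_0}$ and $\omega^{\beta_n} + \dots + \omega^{\beta_0}$ it is defined by

    $(\omega^{\alpha_m} + \dots + \omega^{\alpha_0}) \mathbin{\#} (\omega^{\beta_n} + \dots + \omega^{\beta_0}) := \omega^{\gamma_{m+n+1}} + \dots + \omega^{\gamma_0}$,

where $\gamma_{m+n+1}, \dots, \gamma_0$ is a decreasing permutation of $\alpha_m, \dots, \alpha_0, \beta_n, \dots, \beta_0$.


LEMMA 1.3. $\#$ is associative, commutative and strongly monotonic in
both arguments.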


PROOF. Exercise. 
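
Computationally the natural sum is the merge step of merge sort applied to the two exponent sequences. The following Python sketch assumes, for illustration only, that a Cantor normal form is a tuple of exponents (largest first, each exponent again such a tuple); less and natural_sum are hypothetical names:

```python
ZERO = ()         # the empty sum
ONE = (ZERO,)     # omega^0
OMEGA = (ONE,)    # omega^1


def less(a, b):
    """a < b for Cantor normal forms: lexicographic comparison of exponents."""
    for x, y in zip(a, b):
        if less(x, y):
            return True
        if less(y, x):
            return False
    return len(a) < len(b)


def natural_sum(a, b):
    """Natural (Hessenberg) sum: merge both exponent lists in decreasing order."""
    merged, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if less(a[i], b[j]):
            merged.append(b[j])
            j += 1
        else:
            merged.append(a[i])
            i += 1
    merged.extend(a[i:])    # at most one of the two rests is non-empty
    merged.extend(b[j:])
    return tuple(merged)
```

Unlike ordinary addition, natural_sum(ONE, OMEGA) and natural_sum(OMEGA, ONE) agree; both give (ONE, ZERO), i.e. $\omega+1$.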


1.2. Enumerating Ordinals. In order to work with ordinals in a
purely arithmetical system we set up some effective bijection between our
ordinals $< \varepsilon_0$ and non-negative integers (i.e., a Gödel numbering). For its
definition it is useful to refer to ordinals in the form

    $\omega^{\alpha_m}k_m + \dots + \omega^{\alpha_0}k_0$    with $\alpha_m > \dots > \alpha_0$ and $k_i \ne 0$ ($m \ge -1$).

(By convention, $m = -1$ corresponds to the empty sum.)
For every ordinal $\alpha$ we define its Gödel number $\ulcorner\alpha\urcorner$ inductively by

    $\ulcorner \omega^{\alpha_m}k_m + \dots + \omega^{\alpha_0}k_0 \urcorner := \bigl(\prod_{i \le m} p_{\ulcorner\alpha_i\urcorner}^{k_i}\bigr) - 1$,

where $p_n$ is the $n$-th prime number starting with $p_0 := 2$. For every
non-negative integer $x$ we define its corresponding ordinal notation $o(x)$
inductively by

    $o\bigl(\bigl(\prod_{i \le l} p_i^{q_i}\bigr) - 1\bigr) := \sum_{i \le l} \omega^{o(i)} q_i$,

where the sum is to be understood as the natural sum.

LEMMA 1.4. (a) $o(\ulcorner\alpha\urcorner) = \alpha$,
(b) $\ulcorner o(x) \urcorner = x$.

PROOF. This can be proved easily by induction.


Hence we have a simple bijection between ordinals and non-negative 
integers. Using this bijection we can transfer our relations and operations 
on ordinals to computable relations and operations on non-negative integers. 
We use the following abbreviations. 


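    $x \prec y :\Leftrightarrow o(x) < o(y)$,
    $\omega^x := \ulcorner \omega^{o(x)} \urcorner$,
    $x \oplus y := \ulcorner o(x) + o(y) \urcorner$,
    $xk := \ulcorner o(x)k \urcorner$,
    $\omega_k := \ulcorner \omega_k \urcorner$,

where $\omega_0 := 1$, $\omega_{k+1} := \omega^{\omega_k}$.
We leave it to the reader to verify that $\prec$, $\lambda x.\omega^x$, $\lambda xy.x \oplus y$, $\lambda xk.xk$ and
$\lambda k.\ulcorner\omega_k\urcorner$ are all primitive recursive.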
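
The Gödel numbering and its inverse can be tried out directly. The Python sketch below assumes, purely for illustration, that an ordinal below $\varepsilon_0$ is stored as a tuple of exponents in non-increasing order, so that a coefficient $k$ shows up as $k$ repeated exponents; the names nth_prime, godel and ordinal are hypothetical, and the trial-division prime search is only meant for small inputs.

```python
from functools import cmp_to_key
from math import isqrt

ZERO = ()         # the empty sum
ONE = (ZERO,)     # omega^0
OMEGA = (ONE,)    # omega^1


def less(a, b):
    """a < b for Cantor normal forms: lexicographic comparison of exponents."""
    for x, y in zip(a, b):
        if less(x, y):
            return True
        if less(y, x):
            return False
    return len(a) < len(b)


def nth_prime(n):
    """Return p_n, where p_0 = 2 (plain trial division)."""
    count, candidate = -1, 1
    while count < n:
        candidate += 1
        if all(candidate % d for d in range(2, isqrt(candidate) + 1)):
            count += 1
    return candidate


def godel(a):
    """Goedel number: product of p_{godel(exponent)} over all summands, minus 1."""
    code = 1
    for exponent in a:      # k equal exponents contribute the prime power p^k
        code *= nth_prime(godel(exponent))
    return code - 1


def ordinal(x):
    """Inverse map o(x): every prime factor p_i of x + 1 yields a summand omega^o(i)."""
    n, i, exponents = x + 1, 0, []
    while n > 1:
        p = nth_prime(i)
        while n % p == 0:
            n //= p
            exponents.append(ordinal(i))
        i += 1
    # The natural sum of the summands amounts to sorting the exponents decreasingly.
    order = cmp_to_key(lambda u, v: -1 if less(v, u) else (1 if less(u, v) else 0))
    return tuple(sorted(exponents, key=order))
```

For example godel(OMEGA) is $p_1 - 1 = 2$ and godel((ONE, ZERO)), the code of $\omega+1$, is $p_1 p_0 - 1 = 5$; ordinal inverts both.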


2. Provability of Initial Cases of TI 


We now derive initial cases of the principle of transfinite induction in
arithmetic, i.e., of

    $(\forall x.\, \forall y{\prec}x\, Py \to Px) \to \forall x{\prec}a\, Px$

for some number $a$ and a predicate symbol $P$, where $\prec$ is the standard order
of order type $\varepsilon_0$ defined in the preceding section. In a later section we will
see that our results here are optimal in the sense that for the full system of
ordinals $< \varepsilon_0$ the principle

    $(\forall x.\, \forall y{\prec}x\, Py \to Px) \to \forall x\, Px$

of transfinite induction is underivable. All these results are due to Gentzen
[9].


2.1. Arithmetical Systems. By an arithmetical system Z we mean
a theory based on minimal logic in the $\forall{\to}\bot$-language (including equality
axioms), with the following properties. The language of Z consists of a fixed
(possibly countably infinite) supply of function and relation constants which
are assumed to denote fixed functions and relations on the non-negative
integers for which a computation procedure is known. Among the function
constants there must be a constant $S$ for the successor function and 0 for
(the 0-place function) zero. Among the relation constants there must be
a constant $=$ for equality and $\prec$ for the ordering of type $\varepsilon_0$ of the natural
numbers, as introduced in Section 1. In order to formulate the general
principle of transfinite induction we also assume that a unary relation symbol
$P$ is present, which acts like a free set variable.

Terms are built up from object variables $x, y, z$ by means of $f(t_1, \dots, t_m)$,
where $f$ is a function constant. We identify closed terms which have the
same value; this is a convenient way to express in our formal systems the
assumption that for each function constant a computation procedure is known.
Terms of the form $S(S(\dots S(0) \dots))$ are called numerals. We use the
notation $S^n 0$ or $\overline{n}$ or (only in this chapter) even $n$ for them. Formulas are built
up from $\bot$ and atomic formulas $R(t_1, \dots, t_m)$, with $R$ a relation constant or
a relation symbol, by means of $A \to B$ and $\forall x A$. Recall that we abbreviate
$A \to \bot$ by $\neg A$.

The axioms of Z will always include the Peano axioms, i.e., the universal
closures of

(43)    $S(x) = S(y) \to x = y$,
(44)    $S(x) = 0 \to A$,
(45)    $A(0) \to (\forall x.\, A(x) \to A(S(x))) \to \forall x\, A(x)$,

with $A(x)$ an arbitrary formula. We express our assumption that for every
relation constant $R$ a decision procedure is known by adding the axiom $R\vec{n}$
whenever $R\vec{n}$ is true, and $\neg R\vec{n}$ whenever $R\vec{n}$ is false. Concerning $\prec$ we
require irreflexivity and transitivity for $\prec$ as axioms, and also, following
Schütte, the universal closures of

(46)    $x \prec 0 \to A$,
(47)    $z \prec y \oplus \omega^0 \to (z \prec y \to A) \to (z = y \to A) \to A$,
(48)    $x \oplus 0 = x$,
(49)    $x \oplus (y \oplus z) = (x \oplus y) \oplus z$,
(50)    $0 \oplus x = x$,
(51)    $\omega^x 0 = 0$,
(52)    $\omega^x S(y) = \omega^x y \oplus \omega^x$,
(53)    $z \prec y \oplus \omega^{S(x)} \to z \prec y \oplus \omega^{e(x,y,z)} m(x,y,z)$,
(54)    $z \prec y \oplus \omega^{S(x)} \to e(x,y,z) \prec S(x)$,

where $\oplus$, $\lambda xy.\omega^x y$, $e$ and $m$ denote the appropriate function constants and
$A$ is any formula. (The reader should check that $e$, $m$ can be taken to be
primitive recursive.) These axioms are formal counterparts to the properties
of the ordinal notations observed in the preceding section. We also allow
an arbitrary supply of true formulas $\forall\vec{x} A$ with $A$ quantifier-free and without
$P$ as axioms. Such formulas are called $\Pi_1$-formulas (in the literature also
$\Pi^0_1$-formulas).

Moreover, we may also add an ex-falso-quodlibet schema Efq or even a 
stability schema Stab for A: 


$\forall x.\, \bot \to A$,
$\forall x.\, \neg\neg A \to A$.

Addition of Efq leads to an intuitionistic arithmetical system (the $\forall{\to}\bot$-fragment
of Heyting arithmetic HA), and addition of Stab to a classical
arithmetical system (a version of Peano arithmetic PA). Note that in our
$\forall{\to}\bot$-fragment of minimal logic these schemas are derivable from their
instances

$\forall \vec{x}.\, \bot \to R\vec{x}$,
$\forall \vec{x}.\, \neg\neg R\vec{x} \to R\vec{x}$,


with $R$ a relation constant or the special relation symbol $P$. Note also that
when the stability schema is present, we can replace (44), (46) and (47) by
their more familiar classical versions


(55) $S(x) \neq 0$,
(56) $x \nprec 0$,
(57) $z \prec y \oplus \omega^0 \to z \neq y \to z \prec y$.


We will also consider restricted arithmetical systems $Z_k$. They are defined
like Z, but with the induction schema (45) restricted to formulas $A$ of level
$\mathrm{lev}(A) \le k$. The level of a formula $A$ is defined by


$\mathrm{lev}(R\vec{t}) := \mathrm{lev}(\bot) := 0$,
$\mathrm{lev}(A \to B) := \max(\mathrm{lev}(A) + 1, \mathrm{lev}(B))$,
$\mathrm{lev}(\forall x A) := \max(1, \mathrm{lev}(A))$.


However, the trivial special case of induction $A(0) \to \forall x A(Sx) \to \forall x A$,
which amounts to case distinction, is allowed for arbitrary $A$. (This is needed
in the proof of Theorem 2.2 below.)
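
The three clauses for the level can be read as a recursive function on formulas. The following is a minimal sketch in Python, assuming a hypothetical tuple encoding of formulas (the tags "bot", "atom", "imp", "all" and the helper `neg` are illustrative names, not part of the text). It evaluates the level of the progressiveness formula for $P$ and of the formula expressing transfinite induction.

```python
# Hypothetical encoding of formulas as nested tuples (an assumption of this sketch):
#   ("bot",)          falsum
#   ("atom", "R")     atomic formula; argument terms are ignored here
#   ("imp", A, B)     A -> B
#   ("all", "x", A)   forall x A

def lev(formula):
    """Level of a formula, following the three defining clauses."""
    tag = formula[0]
    if tag in ("bot", "atom"):
        return 0
    if tag == "imp":
        _, a, b = formula
        return max(lev(a) + 1, lev(b))
    if tag == "all":
        _, _, a = formula
        return max(1, lev(a))
    raise ValueError(f"unknown formula tag: {tag!r}")

def neg(a):
    """Negation as the abbreviation A -> bot."""
    return ("imp", a, ("bot",))

P = ("atom", "P")
PREC = ("atom", "prec")  # stands for an atomic formula y < x

# Progressiveness of P: forall x. (forall y. y < x -> Py) -> Px
prog = ("all", "x", ("imp", ("all", "y", ("imp", PREC, P)), P))
# Transfinite induction: Prog -> forall x Px
ti = ("imp", prog, ("all", "x", P))

print(lev(prog), lev(ti), lev(neg(neg(P))))
```

Under this encoding the progressiveness formula has level 2 and the transfinite induction formula has level 3, which agrees with the level 3 quoted later in the chapter.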


2.2. Gentzen’s Proof. 


THEOREM 2.1 (Provable Initial Cases of TI in Z). Transfinite induction
up to $\omega_n$, i.e., for arbitrary $A(x)$ the formula


$(\forall x.\, \forall y{\prec}x\, A(y) \to A(x)) \to \forall x{\prec}\omega_n\, A(x)$,

is derivable in Z.


PROOF. To every formula $A(x)$ we assign a formula $A^+(x)$ (with respect
to a fixed variable $x$) by


$A^+(x) := (\forall y.\, \forall z{\prec}y\, A(z) \to \forall z{\prec}y \oplus \omega^x\, A(z))$.

We first show

If $A(x)$ is progressive, then $A^+(x)$ is progressive,


where “$B(x)$ is progressive” means $\forall x.\, \forall y{\prec}x\, B(y) \to B(x)$. So assume that
$A(x)$ is progressive and


(58) $\forall y{\prec}x\, A^+(y)$.

We have to show $A^+(x)$. So assume further

(59) $\forall z{\prec}y\, A(z)$


and $z \prec y \oplus \omega^x$. We have to show $A(z)$.

Case $x = 0$. Then $z \prec y \oplus \omega^0$. By (47) it suffices to derive $A(z)$ from
$z \prec y$ as well as from $z = y$. If $z \prec y$, then $A(z)$ follows from (59), and if
$z = y$, then $A(z)$ follows from (59) and the progressiveness of $A(x)$.

Case $Sx$. From $z \prec y \oplus \omega^{Sx}$ we obtain $z \prec y \oplus \omega^{e(x,y,z)} m(x,y,z)$ by
(53) and $e(x,y,z) \prec Sx$ by (54). From (58) we obtain $A^+(e(x,y,z))$. By
the definition of $A^+(x)$ we get


$\forall u{\prec}y \oplus \omega^{e(x,y,z)} v\ A(u) \to \forall u{\prec}(y \oplus \omega^{e(x,y,z)} v) \oplus \omega^{e(x,y,z)}\ A(u)$

and hence, using (49) and (52),

$\forall u{\prec}y \oplus \omega^{e(x,y,z)} v\ A(u) \to \forall u{\prec}y \oplus \omega^{e(x,y,z)} S(v)\ A(u)$.

Also from (59) and (51), (48) we obtain

$\forall u{\prec}y \oplus \omega^{e(x,y,z)} 0\ A(u)$.

Using an appropriate instance of the induction schema we can conclude

$\forall u{\prec}y \oplus \omega^{e(x,y,z)} m(x,y,z)\ A(u)$

and hence $A(z)$.


We now show, by induction on $n$, how for an arbitrary formula $A(x)$ we
can obtain a derivation of


$(\forall x.\, \forall y{\prec}x\, A(y) \to A(x)) \to \forall x{\prec}\omega_n\, A(x)$.


So assume the left hand side, i.e., assume that $A(x)$ is progressive.

Case 0. Then $x \prec \omega^0$ and hence $x \prec 0 \oplus \omega^0$ by (50). By (47) it suffices
to derive $A(x)$ from $x \prec 0$ as well as from $x = 0$. Now $x \prec 0 \to A(x)$ holds
by (46), and $A(0)$ then follows from the progressiveness of $A(x)$.

Case $n+1$. Since $A(x)$ is progressive, by what we have shown above
$A^+(x)$ is also progressive. Applying the IH to $A^+(x)$ yields $\forall x{\prec}\omega_n\, A^+(x)$,
and hence $A^+(\omega_n)$ by the progressiveness of $A^+(x)$. Now the definition of
$A^+(x)$ (together with (46) and (50)) yields $\forall z{\prec}\omega^{\omega_n}\, A(z)$.


Note that in the induction step of this proof we have derived transfinite
induction up to $\omega_{n+1}$ for $A(x)$ from transfinite induction up to $\omega_n$ for a
formula of level higher than the level of $A(x)$.

We now want to refine the preceding theorem to a corresponding result
for the subsystems $Z_k$ of Z.

THEOREM 2.2 (Provable Initial Cases of TI in $Z_k$). Let $1 \le l \le k$. Then
in $Z_k$ we can derive transfinite induction for any formula $A(x)$ of level $\le l$
up to $\omega_{k-l+2}[m]$ for arbitrary $m$, i.e.


$(\forall x.\, \forall y{\prec}x\, A(y) \to A(x)) \to \forall x{\prec}\omega_{k-l+2}[m]\, A(x)$,


where $\omega_1[m] := m$, $\omega_{i+1}[m] := \omega^{\omega_i[m]}$.


PROOF. Note first that if $A(x)$ is a formula of level $l \ge 1$, then the
formula $A^+(x)$ constructed in the proof of the preceding theorem has level
$l+1$, and for the proof of

If $A(x)$ is progressive, then $A^+(x)$ is progressive,


we have used induction with an induction formula of level $l$.

Now let $A(x)$ be a fixed formula of level $\le l$, and assume that $A(x)$ is
progressive. Define $A^0 := A$, $A^{i+1} := (A^i)^+$. Then $\mathrm{lev}(A^i) \le l+i$, and hence
in $Z_k$ we can derive that $A^1, A^2, \dots, A^{k-l+1}$ are all progressive. Now from
the progressiveness of $A^{k-l+1}(x)$ we obtain $A^{k-l+1}(0)$, $A^{k-l+1}(1)$, $A^{k-l+1}(2)$
and generally $A^{k-l+1}(m)$ for any $m$, i.e., $A^{k-l+1}(\omega_1[m])$. But since

$A^{k-l+1}(x) = (A^{k-l})^+(x) = \forall y(\forall z{\prec}y\, A^{k-l}(z) \to \forall z{\prec}y \oplus \omega^x\, A^{k-l}(z))$

we first get (with $y = 0$) $\forall z{\prec}\omega_2[m]\, A^{k-l}(z)$ and then $A^{k-l}(\omega_2[m])$ by the
progressiveness of $A^{k-l}$. Repeating this argument we finally obtain


$\forall z{\prec}\omega_{k-l+2}[m]\, A^0(z)$.

This concludes the proof.

Our next aim is to prove that these bounds are sharp. More precisely, we
will show that in Z (no matter how many true $\Pi_1$-formulas we have added
as axioms) one cannot derive “purely schematic” transfinite induction up to
$\varepsilon_0$, i.e., one cannot derive the formula


$(\forall x.\, \forall y{\prec}x\, Py \to Px) \to \forall x\, Px$


with a relation symbol $P$, and that in $Z_k$ one cannot derive transfinite
induction up to $\omega_{k+1}$, i.e., the formula


$(\forall x.\, \forall y{\prec}x\, Py \to Px) \to \forall x{\prec}\omega_{k+1}\, Px$.


This will follow from the method of normalization applied to arithmetical 
systems, which we have to develop first. 


3. Normalization with the Omega Rule 


We will show in Theorem 4.7 that a normalization theorem does not hold 
for arithmetical systems Z, in the sense that for any formula A derivable in Z 
there is a derivation of the same formula A in Z which only uses formulas of 
a level bounded by the level of A. The reason for this failure is the presence 
of induction axioms, which can be of arbitrary level. 

Here we remove that obstacle against normalization in a somewhat drastic
way: we leave the realm of proofs as finite combinatory objects and
replace the induction axiom by a rule with infinitely many premises, the
so-called $\omega$-rule (suggested by Hilbert and studied by Lorenzen, Novikov and
Schütte), which allows us to conclude $\forall x A(x)$ from $A(0), A(1), A(2), \dots$, i.e.


$$\dfrac{\begin{matrix} d_0 & d_1 & \dots & d_i & \dots \\ A(0) & A(1) & \dots & A(i) & \dots \end{matrix}}{\forall x\, A(x)}\ \omega$$


So derivations can be viewed as labelled infinite (countably branching) trees. 
As in the finitary case a label consists of the derived formula and the name of 
the rule applied. Since we define derivations inductively, any such derivation 
tree must be well-founded, i.e., must not contain an infinite descending path. 

Clearly this $\omega$-rule can also be used to replace the rule $\forall^+ x$. As a
consequence we do not need to consider free individual variables.

It is plain that every derivation in an arithmetical system Z can be
translated into an infinitary derivation with the $\omega$-rule; this will be carried
out in Lemma 3.3 below. The resulting infinitary derivation has a noteworthy
property: in any application of the $\omega$-rule the cutranks of the infinitely
many immediate subderivations $d_i$ are bounded, and also their sets of free
assumption variables are bounded by a finite set. Here the cutrank of a
derivation is as usual the least number $\ge$ the level of any subderivation
obtained by $\to^+$ as the main premise of $\to^-$ or by the $\omega$-rule as the main
premise of $\forall^-$, where the level of a derivation is the level of its type as a
term, i.e., of the formula it derives. A derivation is called normal
iff its cutrank is zero, and we will prove below that any (possibly infinite)
derivation of finite cutrank can be transformed into a derivation of cutrank
zero. The resulting normal derivation will continue to be infinite, so the
result may seem useless at first sight. However, we will be able to bound the
depth of the resulting derivation in an informative way, and this will enable
us in Section 4 to obtain the desired results on unprovable initial cases of
transfinite induction. Let us now carry out this programme.

N.B. The standard definition of cutrank in predicate logic measures the 
depth of formulas; here one uses the level. 


3.1. Infinitary Derivations. The systems $Z^\infty$ of $\omega$-arithmetic are defined
as follows. $Z^\infty$ has the same language and, apart from the induction
axioms, the same axioms as Z. Derivations in $Z^\infty$ are infinite objects. It
is useful to employ a term notation for these, and we temporarily use $d, e, f$
to denote such (infinitary) derivation terms. For the term corresponding to
the deduction obtained by applying the $\omega$-rule to $d_i$, $i \in \mathbb{N}$, we write $\langle d_i \rangle_{i < \omega}$.
However, for our purposes here it suffices to only consider derivations whose
depth is bounded below $\varepsilon_0$.

We define the notion “$d$ is a derivation of depth $\le \alpha$” (written $|d| \le \alpha$)
inductively as follows ($i$ ranges over numerals).


(A) Any assumption variable $u^A$ with $A$ a closed formula and any axiom
$\mathrm{Ax}^A$ is a derivation of depth $\le \alpha$, for any $\alpha$.
($\to^+$) If $d^B$ is a derivation of depth $\le \alpha_0 < \alpha$, then $(\lambda u^A.\, d^B)^{A \to B}$ is a
derivation of depth $\le \alpha$.
($\to^-$) If $d^{A \to B}$ and $e^A$ are derivations of depths $\le \alpha_i < \alpha$ ($i = 1, 2$), then
$(d^{A \to B} e^A)^B$ is a derivation of depth $\le \alpha$.
($\omega$) For all $A(x)$, if $d_i^{A(i)}$ are derivations of depths $\le \alpha_i < \alpha$ ($i < \omega$),
then $(\langle d_i^{A(i)} \rangle_{i < \omega})^{\forall x A}$ is a derivation of depth $\le \alpha$.
($\forall^-$) For all $A(x)$, if $d^{\forall x A(x)}$ is a derivation of depth $\le \alpha_0 < \alpha$, then, for
all $i$, $(d^{\forall x A(x)} i)^{A(i)}$ is a derivation of depth $\le \alpha$.
We will use $|d|$ to denote the least $\alpha$ such that $|d| \le \alpha$.

Note that in ($\forall^-$) it suffices to use numerals as minor premises. The
reason is that we only need to consider closed terms, and any such term is 
in our setup identified with a numeral. 

The cutrank cr(d) of a derivation d is defined by 


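$\mathrm{cr}(u^A) := \mathrm{cr}(\mathrm{Ax}^A) := 0$,

$\mathrm{cr}(\lambda u\, d) := \mathrm{cr}(d)$,

$$\mathrm{cr}(d^{A \to B} e^A) := \begin{cases} \max(\mathrm{lev}(A \to B), \mathrm{cr}(d), \mathrm{cr}(e)), & \text{if } d = \lambda u\, d', \\ \max(\mathrm{cr}(d), \mathrm{cr}(e)), & \text{otherwise,} \end{cases}$$

$\mathrm{cr}(\langle d_i \rangle_{i < \omega}) := \sup_{i < \omega} \mathrm{cr}(d_i)$,

$$\mathrm{cr}(d^{\forall x A(x)} j) := \begin{cases} \max(\mathrm{lev}(\forall x A(x)), \mathrm{cr}(d)), & \text{if } d = \langle d_i \rangle_{i < \omega}, \\ \mathrm{cr}(d), & \text{otherwise.} \end{cases}$$


Clearly $\mathrm{cr}(d) \in \mathbb{N} \cup \{\omega\}$ for all $d$. For our purposes it will suffice to consider
only derivations with finite cutranks (i.e., with $\mathrm{cr}(d) \in \mathbb{N}$) and with finitely
many free assumption variables.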
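
To illustrate how the cutrank clauses interact with the level of a formula, here is a minimal Python sketch. It is restricted to the finitary implication part of the definition: the $\omega$-rule and $\forall^-$ are omitted, and the tuple encoding of formulas and derivation terms is an assumption made only for this example.

```python
# Hypothetical encoding (assumption of this sketch), implication fragment only.
# Formulas:    ("bot",), ("atom", name), ("imp", A, B)
# Derivations: ("var", "u", A)      assumption variable u^A
#              ("ax", A)            axiom Ax^A
#              ("lam", "u", A, d)   lambda u^A. d   (implication introduction)
#              ("app", d, e)        d e             (implication elimination)

def lev(f):
    if f[0] in ("bot", "atom"):
        return 0
    if f[0] == "imp":
        return max(lev(f[1]) + 1, lev(f[2]))
    raise ValueError(f[0])

def formula_of(d):
    """The formula derived by d, i.e. its type as a term."""
    tag = d[0]
    if tag == "var":
        return d[2]
    if tag == "ax":
        return d[1]
    if tag == "lam":
        return ("imp", d[2], formula_of(d[3]))
    if tag == "app":
        fun, arg = formula_of(d[1]), formula_of(d[2])
        if fun[0] != "imp" or fun[1] != arg:
            raise TypeError("ill-formed application")
        return fun[2]
    raise ValueError(tag)

def cr(d):
    """Cutrank: an introduction used as main premise of an elimination counts
    with the level of the formula it derives."""
    tag = d[0]
    if tag in ("var", "ax"):
        return 0
    if tag == "lam":
        return cr(d[3])
    if tag == "app":
        fun, arg = d[1], d[2]
        formula_of(d)  # raises if the application is ill-formed
        base = max(cr(fun), cr(arg))
        return max(lev(formula_of(fun)), base) if fun[0] == "lam" else base
    raise ValueError(tag)

P, Q = ("atom", "P"), ("atom", "Q")
identity = ("lam", "u", P, ("var", "u", P))                     # derives P -> P
redex = ("app", identity, ("var", "v", P))                     # a cut on P -> P
normal = ("app", ("var", "w", ("imp", P, Q)), ("var", "v", P))  # no cut
print(cr(redex), cr(normal))
```

The redex has cutrank 1 because the cut formula $P \to P$ has level 1, while the application of an assumption variable is normal and has cutrank 0.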


LEMMA 3.1. If $d$ is a derivation of depth $\le \alpha$, with free assumption
variables among $u, \vec{u}$ and of cutrank $\mathrm{cr}(d) = k$, and $e$ is a derivation of
depth $\le \beta$, with free assumption variables among $\vec{u}$ and of cutrank $\mathrm{cr}(e) = l$,
then $d[u := e]$ is a derivation with free assumption variables among $\vec{u}$, of
depth $|d[u := e]| \le \beta + \alpha$ and of cutrank $\mathrm{cr}(d[u := e]) \le \max(\mathrm{lev}(e), k, l)$.


PROOF. Straightforward induction on the depth of d. 


Using this lemma we can now embed our systems $Z_k$ (i.e., arithmetic
with induction restricted to formulas of level $\le k$) and hence Z into $Z^\infty$. In
this embedding we refer to the number $n_I(d)$ of nested applications of the
induction schema within a $Z_k$-derivation $d$.

The nesting of applications of induction in $d$, $n_I(d)$, is defined by induction
on $d$, as follows.


$n_I(u) := n_I(\mathrm{Ax}) := 0$,

$n_I(\mathrm{Ind}) := 1$,

$n_I(\mathrm{Ind}\,\vec{t}\,d\,e) := \max(n_I(d), n_I(e) + 1)$,

$n_I(de) := \max(n_I(d), n_I(e))$, if $d$ is not of the form $\mathrm{Ind}\,\vec{t}\,d_0$,

$n_I(\lambda u\, d) := n_I(\lambda x\, d) := n_I(dt) := n_I(d)$.
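
The clauses for the nesting of induction can be sketched as a recursive function. The following Python fragment is only an illustration under an assumed tuple encoding of derivation terms (the tags "ind", "app", "tapp", "lam", "tlam" are hypothetical names); it shows that induction nested inside the step derivation increases the count, while induction inside the base derivation does not.

```python
# Hypothetical encoding of derivation terms (assumption of this sketch):
#   ("var", "u"), ("ax", "name")     assumption variables and axioms
#   ("ind",)                         the induction axiom Ind
#   ("app", d, e)                    application of d to a derivation e
#   ("tapp", d, "t")                 application of d to an object term t
#   ("lam", "u", d), ("tlam", "x", d)   abstraction over u resp. x

def is_ind_head(d):
    """True if d is Ind applied to zero or more parameter terms."""
    while d[0] == "tapp":
        d = d[1]
    return d[0] == "ind"

def nesting(d):
    """Number of nested applications of induction, clause by clause."""
    tag = d[0]
    if tag in ("var", "ax"):
        return 0
    if tag == "ind":
        return 1
    if tag in ("lam", "tlam"):
        return nesting(d[2])
    if tag == "tapp":
        return nesting(d[1])
    if tag == "app":
        fun, arg = d[1], d[2]
        if fun[0] == "app" and is_ind_head(fun[1]):
            # d has the form Ind t d0 e: base derivation fun[2], step derivation arg
            return max(nesting(fun[2]), nesting(arg) + 1)
        return max(nesting(fun), nesting(arg))
    raise ValueError(tag)

ind = ("ind",)
base = ("ax", "base")
step = ("var", "s")
once = ("app", ("app", ("tapp", ind, "t"), base), step)   # Ind t base step
in_step = ("app", ("app", ind, base), once)               # induction inside the step
in_base = ("app", ("app", ind, once), step)               # induction inside the base
print(nesting(once), nesting(in_step), nesting(in_base))
```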


3.2. Long Normal Form. For the next lemma we need the notion of 
the long normal form of a derivation. In Subsection 3.7 of Chapter 1 we 
have studied the form of normal derivations in minimal logic. We considered 
the notion of a track and observed that in every track all elimination rules
precede all introduction rules, and that in a uniquely determined minimal 
node we encounter a minimal formula, that is a subformula of any formula 
in the elimination part as well as in the introduction part of the track. In 
the notion of a long normal form we additionally require that every minimal 
formula is atomic. 

For simplicity we restrict ourselves to the $\to$-fragment of minimal
propositional logic; however, our considerations are valid for the full language as
well.

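For terms of the typed $\lambda$-calculus we define the $\eta$-expansion of a variable
by

$\eta_V(x^{\vec{\rho} \to \iota}) := \lambda \vec{z}^{\,\vec{\rho}}.\, x\, \eta_V(\vec{z})$,


so by induction on the type of the variable. The $\eta$-expansion of a term can
then be defined by induction on terms:

$\eta(\lambda \vec{y}.\, (x \vec{M})^{\vec{\rho} \to \iota}) := \lambda \vec{y}, \vec{z}.\, x\, \eta(\vec{M})\, \eta_V(\vec{z})$.

Note that we always have $\eta(x) = \eta_V(x)$. Hence clearly: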


LEMMA 3.2. Every term can be transformed into long normal form, by 
first normalizing and then n-expanding it. 
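
The $\eta$-expansion of a variable can be sketched as a recursion on its type. The Python fragment below assumes a hypothetical encoding in which the base type $\iota$ is the string "o" and arrow types are tuples; the fresh names z0, z1, ... are generated only for illustration.

```python
from itertools import count

# Hypothetical encoding (assumption of this sketch):
#   types: "o" for the base type, ("->", rho, sigma) for rho -> sigma
#   terms: ("var", name), ("lam", name, type, body), ("app", fun, arg)

def arg_types(ty):
    """Split rho_1 -> ... -> rho_n -> o into the list [rho_1, ..., rho_n]."""
    args = []
    while ty != "o":
        _, rho, ty = ty
        args.append(rho)
    return args

def eta_var(name, ty, fresh=None):
    """Eta-expansion of the variable `name` of type `ty`."""
    fresh = fresh if fresh is not None else count()
    zs = [(f"z{next(fresh)}", rho) for rho in arg_types(ty)]
    body = ("var", name)
    for z, rho in zs:
        # each argument variable is itself eta-expanded at its own type
        body = ("app", body, eta_var(z, rho, fresh))
    for z, rho in reversed(zs):
        body = ("lam", z, rho, body)
    return body

second_order = ("->", ("->", "o", "o"), "o")
print(eta_var("x", second_order))
```

For a variable $x$ of type $(\iota \to \iota) \to \iota$ the result corresponds to $\lambda z_0.\, x(\lambda z_1.\, z_0 z_1)$, in which every minimal subterm has base type.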


3.3. Embedding of $Z_k$.


LEMMA 3.3. Let a $Z_k$-derivation in long normal form be given with $\le m$
nested applications of the induction schema, i.e., of


$A(0) \to (\forall x.\, A(x) \to A(Sx)) \to \forall x A(x)$,

all with $\mathrm{lev}(A) \le k$. We consider subderivations $d^B$ not of the form $\mathrm{Ind}\,\vec{t}$ or
$\mathrm{Ind}\,\vec{t}\,d_0$. For every such subderivation and closed substitution instance $B\sigma$ of
$B$ we construct $(d^\infty_\sigma)^{B\sigma}$ in $Z^\infty$ with free assumption variables $u^{C\sigma}$ for $u^C$ a
free assumption of $d$, such that $|d^\infty_\sigma| < \omega^{m+1}$ and $\mathrm{cr}(d^\infty_\sigma) \le k$, and moreover
such that $d$ is obtained by $\to^+$ iff $d^\infty_\sigma$ is, and $d$ is obtained by $\forall^+$ or of the
form $\mathrm{Ind}\,\vec{t}\,d_0 e$ iff $d^\infty_\sigma$ is obtained by the $\omega$-rule.


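PROOF. By recursion on such subderivations $d$.
Case $u^C$ or Ax. Take $u^{C\sigma}$ or Ax.
Case $\mathrm{Ind}\,\vec{t}\,d\,e'$. Since the deduction is in long normal form, $e' = \lambda x v.\, e$.
By IH we have $d^\infty_\sigma$ and $e^\infty_\sigma$. (Note that neither $d$ nor $e$ can have one of the
forbidden forms $\mathrm{Ind}\,\vec{t}$ and $\mathrm{Ind}\,\vec{t}\,d_0$, since both are in long normal form.) Write
$e^\infty_\sigma(t, f)$ for $e^\infty_\sigma[x, v := t, f]$, and let

$(\mathrm{Ind}\,\vec{t}\,d(\lambda x v.\, e))^\infty_\sigma := \langle d^\infty_\sigma,\ e^\infty_\sigma(0, d^\infty_\sigma),\ e^\infty_\sigma(1, e^\infty_\sigma(0, d^\infty_\sigma)),\ \dots \rangle$.

By IH $|e^\infty_\sigma| \le \omega^{m-1} \cdot p$ and $|d^\infty_\sigma| \le \omega^m \cdot q$ for some $p, q < \omega$. By Lemma 3.1
we obtain

$|e^\infty_\sigma(0, d^\infty_\sigma)| \le \omega^m \cdot q + \omega^{m-1} \cdot p$,

$|e^\infty_\sigma(1, e^\infty_\sigma(0, d^\infty_\sigma))| \le \omega^m \cdot q + \omega^{m-1} \cdot 2p$

and so on, and hence

$|(\mathrm{Ind}\,\vec{t}\,d(\lambda x v.\, e))^\infty_\sigma| \le \omega^m \cdot (q + 1)$.

Concerning the cutrank we have by IH $\mathrm{cr}(d^\infty_\sigma), \mathrm{cr}(e^\infty_\sigma) \le k$. Therefore

$\mathrm{cr}(e^\infty_\sigma(0, d^\infty_\sigma)) \le \max(\mathrm{lev}(A(0)), \mathrm{cr}(d^\infty_\sigma), \mathrm{cr}(e^\infty_\sigma)) \le k$,

$\mathrm{cr}(e^\infty_\sigma(1, e^\infty_\sigma(0, d^\infty_\sigma))) \le \max(\mathrm{lev}(A(1)), k, \mathrm{cr}(e^\infty_\sigma)) = k$,

and so on, and hence


$\mathrm{cr}((\mathrm{Ind}\,\vec{t}\,d(\lambda x v.\, e))^\infty_\sigma) \le k$.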


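Case $\lambda u^C.\, d^B$. By IH, we have $(d^\infty_\sigma)^{B\sigma}$ with possibly free assumptions
$u^{C\sigma}$. Take $(\lambda u\, d)^\infty_\sigma := \lambda u^{C\sigma}.\, d^\infty_\sigma$.

Case $de$, with $d$ not of the form $\mathrm{Ind}\,\vec{t}$ or $\mathrm{Ind}\,\vec{t}\,d_0$. By IH we have $d^\infty_\sigma$
and $e^\infty_\sigma$. Since $de$ is a subderivation of a normal derivation we know that $d$
and hence also $d^\infty_\sigma$ is not obtained by $\to^+$. Therefore $(de)^\infty_\sigma := d^\infty_\sigma e^\infty_\sigma$ is
normal and $\mathrm{cr}(d^\infty_\sigma e^\infty_\sigma) = \max(\mathrm{cr}(d^\infty_\sigma), \mathrm{cr}(e^\infty_\sigma)) \le k$. Also we clearly have
$|d^\infty_\sigma e^\infty_\sigma| < \omega^{m+1}$.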

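Case $(\lambda x.\, d)^{\forall x B(x)}$. By IH for every $i$ and substitution instance $B(i)\sigma$
we have $d^\infty_{\sigma,i}$. Take $(\lambda x.\, d)^\infty_\sigma := \langle d^\infty_{\sigma,i} \rangle_{i < \omega}$.

Case $(dt)^{B[x := t]}$. By IH, we have $(d^\infty_\sigma)^{(\forall x B)\sigma}$. Let $j$ be the numeral
with the same value as $t\sigma$. If $d^\infty_\sigma = \langle d_i \rangle_{i < \omega}$ (which can only be the case
if $d = \mathrm{Ind}\,\vec{t}\,d_0 e_0$, for $dt$ is a subderivation of a normal derivation), take
$(dt)^\infty_\sigma := d_j$. Otherwise take $(dt)^\infty_\sigma := d^\infty_\sigma j$.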


3.4. Normalization for $Z^\infty$. A derivation is called convertible or a
redex if it is of the form $(\lambda u.\, d)e$ or else $\langle d_i \rangle_{i < \omega}\, j$, which can be converted
into $d[u := e]$ or $d_j$, respectively. A derivation is called normal if it does not
contain a convertible subderivation. Note that a derivation is normal iff it
is of cutrank 0.

Call a derivation a simple application if it is of the form $d_0 d_1 \dots d_m$, with
$d_0$ an assumption variable or an axiom.

We want to define an operation which by repeated conversions transforms
a given derivation into a normal one with the same end formula and
no additional free assumption variables. The usual methods to achieve such
a task have to be adapted properly in order to deal with the new situation 
of infinitary derivations. Here we give a particularly simple argument due 
to Tait [26]. 

LEMMA 3.4. For any derivation $d^A$ of depth $\le \alpha$ and cutrank $k+1$ we
can find a derivation $(d^k)^A$ with free assumption variables contained in those
of $d$, which has depth $\le 2^\alpha$ and cutrank $\le k$.


PROOF. By induction on $\alpha$. The only case which requires some argument
is when the derivation is of the form $de$ with $|d| \le \alpha_1 < \alpha$ and
$|e| \le \alpha_2 < \alpha$, but is not a simple application. We first consider the subcase
where $d^k = \lambda u.\, d_1(u)$ and $\mathrm{lev}(d) = k+1$. Then $\mathrm{lev}(e) \le k$ by the
definition of level (recall that the level of a derivation was defined to be
the level of the formula it derives), and hence $d_1[u := e^k]$ has cutrank $\le k$
by Lemma 3.1. Furthermore, also by Lemma 3.1, $d_1[u := e^k]$ has depth
$\le 2^{\alpha_2} + 2^{\alpha_1} \le 2^{\max(\alpha_2, \alpha_1) + 1} \le 2^\alpha$. Hence we can take $(de)^k$ to be $d_1[u := e^k]$.

In the subcase where $d^k = \langle d_i \rangle_{i < \omega}$, $\mathrm{lev}(d) = k+1$ and $e^k = j$ we can
take $(de)^k$ to be $d_j$, since clearly $d_j$ has cutrank $\le k$ and depth $\le 2^\alpha$. If
we are not in the above subcases, we can simply take $(de)^k$ to be $d^k e^k$.
This derivation clearly has depth $\le 2^\alpha$. Also it has cutrank $\le k$, which can
be seen as follows. If $\mathrm{lev}(d) \le k+1$ we are done. But $\mathrm{lev}(d) \ge k+2$ is
impossible, since we have assumed that $de$ is not a simple application. In
order to see this, note that if $de$ is not a simple application, it must be of
the form $d_0 d_1 \dots d_n e$ with $d_0$ not an assumption variable or axiom and $d_0$
not itself of the form $d' d''$; then $d_0$ must end with an introduction $\to^+$ or
$\omega$, hence there is a cut of a degree exceeding $k+1$, which is excluded by
assumption.


As an immediate consequence we obtain: 


THEOREM 3.5 (Normalization for $Z^\infty$). For any derivation $d^A$ of depth
$\le \alpha$ and cutrank $\le k$ we can find a normal derivation $(d^*)^A$ with free
assumption variables contained in those of $d$, which has depth $\le 2_k^\alpha$, where
$2_0^\alpha := \alpha$, $2_{m+1}^\alpha := 2^{2_m^\alpha}$.
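
To get a feeling for the bound, the iterated exponential can be evaluated for finite arguments. The depths in the theorem are ordinals, so the following Python sketch (with the hypothetical helper name `iter_exp`) only illustrates the recursion $2_0^\alpha = \alpha$, $2_{m+1}^\alpha = 2^{2_m^\alpha}$ on natural numbers; it is not an implementation of ordinal exponentiation.

```python
def iter_exp(m, a):
    """2_m^a for natural numbers a: 2_0^a = a and 2_{m+1}^a = 2 ** (2_m^a)."""
    result = a
    for _ in range(m):
        result = 2 ** result
    return result

# Each cutrank reduction step costs one exponential in the depth.
for m in range(4):
    print(m, iter_exp(m, 3))
```

Starting from depth 3, the successive bounds are 3, 8, 256 and $2^{256}$, one exponential per unit of cutrank removed.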

As in Section 3.7 of Chapter 1 we can now analyze the structure of
normal derivations in $Z^\infty$. In particular we obtain:


THEOREM 3.6 (Subformula Property for $Z^\infty$). Let $d$ be a normal deduction
in $Z^\infty$ for $\Gamma \vdash A$. Then each formula in $d$ is a subformula of a formula
in $\Gamma \cup \{A\}$.


PROOF. We prove this for tracks of order n, by induction on n. 


4. Unprovable Initial Cases of Transfinite Induction 


We now apply the technique of normalization for arithmetic with the
$\omega$-rule to obtain a proof that transfinite induction up to $\varepsilon_0$ is underivable in
Z, i.e., a proof of


$Z \nvdash (\forall x.\, \forall y{\prec}x\, Py \to Px) \to \forall x\, Px$

with a relation symbol $P$, and that transfinite induction up to $\omega_{k+1}$ is
underivable in $Z_k$, i.e., a proof of


$Z_k \nvdash (\forall x.\, \forall y{\prec}x\, Py \to Px) \to \forall x{\prec}\omega_{k+1}\, Px$.


It clearly suffices to prove this for arithmetical systems based on classical
logic. Hence we may assume that we have used only the classical versions
(55), (56) and (57) of the axioms from Subsection 2.1.

Our proof is based on an idea of Schütte, which consists in adding a
so-called progression rule to the infinitary systems. This rule allows us to
conclude $Pj$ (where $j$ is any numeral) from all $Pi$ for $i \prec j$.


4.1. Progression Rule. More precisely, we define the notion of a
derivation in $Z^\infty + \mathrm{Prog}(P)$ of depth $\le \alpha$ by the inductive clauses above
and the additional clause Prog(P):

(Prog) If for all $i \prec j$ we have derivations $d_i^{Pi}$ of depths $\le \alpha_i < \alpha$, then
$\langle d_i^{Pi} \rangle_{i \prec j}^{Pj}$ is a derivation of depth $\le \alpha$.
We also define $\mathrm{cr}(\langle d_i \rangle_{i \prec j}) := \sup_{i \prec j} \mathrm{cr}(d_i)$.

Since this progression rule only deals with derivations of atomic formulas,
it does not affect the cutranks of derivations. Hence the proof of normalization
for $Z^\infty$ carries over unchanged to $Z^\infty + \mathrm{Prog}(P)$. In particular we
have

LEMMA 4.1. For any derivation $d^A$ in $Z^\infty + \mathrm{Prog}(P)$ of depth $\le \alpha$ and
cutrank $\le k+1$ we can find a derivation $(d^k)^A$ in $Z^\infty + \mathrm{Prog}(P)$ with free
assumption variables contained in those of $d$, which has depth $\le 2^\alpha$ and
cutrank $\le k$.

We now show that from the progression rule for P we can easily derive 
the progressiveness of P. 

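LEMMA 4.2. We have a normal derivation of $\forall x.\, \forall y{\prec}x\, Py \to Px$ in
$Z^\infty + \mathrm{Prog}(P)$ with depth 5.


PROOF.

$$
\dfrac{
  \dfrac{
    \dfrac{
      \dfrac{
        \dfrac{\forall y{\prec}j\, Py \quad i}{i \prec j \to Pi}\ \forall^-
        \quad i \prec j
      }{Pi}\ {\to^-}
      \quad \dots\ (\text{all } i \prec j)
    }{Pj}\ \mathrm{Prog}
  }{\forall y{\prec}j\, Py \to Pj}\ {\to^+}
  \quad \dots\ (\text{all } j)
}{\forall x.\, \forall y{\prec}x\, Py \to Px}\ \omega
$$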


4.2. Quasi-Normal Derivations. The crucial observation now is that
a normal derivation of $P\ulcorner\beta\urcorner$ must essentially have a depth of at least $\beta$.
However, to obtain the right estimates for the subsystems $Z_k$ we cannot
apply Lemma 4.1 down to cutrank 0 (i.e., to normal form) but must stop
at cutrank 1. Such derivations, i.e., those of cutrank $\le 1$, will be called
quasi-normal; they can also be analyzed easily.

We begin by showing that a quasi-normal derivation of a quantifier-free
formula can always be transformed without increasing its cutrank or its
depth into a quasi-normal derivation of the same formula which
(1) does not use the $\omega$-rule, and
(2) contains $\forall^-$ only in the initial part of a track starting with an
axiom.


Recall that our axioms are of the form $\forall \vec{x} A$ with $A$ quantifier-free.
The quasi-subformulas of a formula $A$ are defined by the clauses


(a) $A$, $B$ are quasi-subformulas of $A \to B$;
(b) $A(i)$ is a quasi-subformula of $\forall x A(x)$, for all numerals $i$;
(c) If $A$ is a quasi-subformula of $B$, and $C$ is an atomic formula, then $C \to A$
and $\forall x A$ are quasi-subformulas of $B$;
(d) “...is quasi-subformula of...” is a reflexive and transitive relation.
For example, $Q \to \forall x.\, P \to A$, with $P$, $Q$ atomic, is a quasi-subformula of
$A \to B$.
We now transfer the subformula property for normal derivations (Theorem 3.6)
to a quasi-subformula property for quasi-normal derivations.
THEOREM 4.3 (Quasi-Subformula Property). Let $d$ be a quasi-normal
deduction in $Z^\infty + \mathrm{Prog}(P)$ for $\Gamma \vdash A$. Then each formula in $d$ is a
quasi-subformula of a formula in $\Gamma \cup \{A\}$.


PROOF. We prove this for tracks of order n, by induction on n. 


COROLLARY 4.4. Let $d$ be a quasi-normal deduction in $Z^\infty + \mathrm{Prog}(P)$
of a formula $\forall \vec{x} A$ with $A$ quantifier-free from quantifier-free assumptions.
Then any track in $d$ of positive order ends with a quantifier-free formula.


PROOF. If not, then the major premise of the $\to^-$ whose minor premise
is the offending end formula of the track would contain a quantifier to the
left of $\to$. This contradicts Theorem 4.3.


4.3. Elimination of the Omega Rule. Our next aim is to eliminate
the $\omega$-rule. For this we need the notion of an instance of a formula, defined
by the following clauses.


(a) If $B'$ is an instance of $B$ and $A$ is quantifier-free, then $A \to B'$ is an
instance of $A \to B$;

(b) $A(i)$ is an instance of $\forall x A(x)$, for all numerals $i$;

(c) The relation “...is an instance of ...” is reflexive and transitive.


LEMMA 4.5. Let $d$ be a quasi-normal deduction in $Z^\infty + \mathrm{Prog}(P)$ of
a formula $A$ without $\forall$ to the left of $\to$ from quantifier-free assumptions.
Then for any quantifier-free instance $A'$ of $A$ we can find a quasi-normal
derivation $d'$ of $A'$ from the same assumptions such that


(a) $d'$ does not use the $\omega$-rule,
(b) $d'$ contains $\forall^-$ only in the initial elimination part of a track starting
with an axiom, and
(c) $|d'| \le |d|$.


PROOF. By induction on the depth of $d$. We distinguish cases according
to the last rule in $d$.
Case $\to^-$.




By the quasi-subformula property $A$ must be quantifier-free. Let $B'$ be a
quantifier-free instance of $B$. Then by definition $A \to B'$ is a quantifier-free
instance of $A \to B$. The claim now follows from the IH.

Case $\to^+$.

$$\dfrac{B}{A \to B}\ {\to^+}$$

Any instance of $A \to B$ has the form $A \to B'$ with $B'$ an instance of $B$.
Hence the claim follows from the IH.

Case $\forall^-$.

$$\dfrac{\forall x\, A(x) \quad i}{A(i)}\ \forall^-$$

Then any quantifier-free instance of $A(i)$ is also a quantifier-free instance of
$\forall x A(x)$, and hence the claim follows from the IH.

Case $\omega$.

$$\dfrac{\dots \quad A(i) \quad \dots \ (\text{all } i < \omega)}{\forall x\, A(x)}\ \omega$$

Any quantifier-free instance of $\forall x A(x)$ has the form $A(i)'$ with $A(i)'$ a
quantifier-free instance of $A(i)$. Hence the claim again follows from the IH.


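A derivation $d$ in $Z^\infty + \mathrm{Prog}(P)$ is called a $P\vec{\alpha}, \neg P\vec{\beta}$-refutation if $\vec{\alpha}$ and
$\vec{\beta}$ are disjoint and $d$ derives a formula $\vec{A} \to B := A_1 \to \dots \to A_k \to B$ with
$\vec{A}$ and the free assumptions in $d$ among $P\ulcorner\alpha_1\urcorner, \dots, P\ulcorner\alpha_m\urcorner$, $\neg P\ulcorner\beta_1\urcorner, \dots,
\neg P\ulcorner\beta_n\urcorner$ or true quantifier-free formulas without $P$, and $B$ a false
quantifier-free formula without $P$ or else among $P\ulcorner\beta_1\urcorner, \dots, P\ulcorner\beta_n\urcorner$.

(So, classically, a $P\vec{\alpha}, \neg P\vec{\beta}$-refutation shows $\bigwedge_i P\ulcorner\alpha_i\urcorner \to \bigvee_j P\ulcorner\beta_j\urcorner$.)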


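LEMMA 4.6. Let $d$ be a quasi-normal $P\vec{\alpha}, \neg P\vec{\beta}$-refutation. Then

$\min(\vec{\beta}) \le |d| + \mathrm{lh}(\vec{\alpha}')$,

where $\vec{\alpha}'$ is the sublist of $\vec{\alpha}$ consisting of all $\alpha_i < \min(\vec{\beta})$, and $\mathrm{lh}(\vec{\alpha}')$ denotes
the length of the list $\vec{\alpha}'$.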


PROOF. By induction on $|d|$. By the Lemma above we may assume that
$d$ does not contain the $\omega$-rule, and contains $\forall^-$ only in a context where
leading universal quantifiers of an axiom are removed. We distinguish cases
according to the last rule in $d$.

Case $\to^+$. By our definition of refutations the claim follows immediately
from the IH.

Case $\to^-$. Then $d = f^{C \to (\vec{A} \to B)} e^C$. If $C$ is a true quantifier-free formula
without $P$ or of the form $P\ulcorner\gamma\urcorner$ with $\gamma < \min(\vec{\beta})$, the claim follows from the
IH for $f$:

$\min(\vec{\beta}) \le |f| + \mathrm{lh}(\vec{\alpha}') + 1 \le |d| + \mathrm{lh}(\vec{\alpha}')$.

If $C$ is a false quantifier-free formula without $P$ or of the form $P\ulcorner\gamma\urcorner$ with
$\min(\vec{\beta}) \le \gamma$, the claim follows from the IH for $e$:

$\min(\vec{\beta}) \le |e| + \mathrm{lh}(\vec{\alpha}') + 1 \le |d| + \mathrm{lh}(\vec{\alpha}')$.

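It remains to consider the case when $C$ is a quantifier-free implication
involving $P$. Then $\mathrm{lev}(C) \ge 1$, hence $\mathrm{lev}(C \to (\vec{A} \to B)) \ge 2$ and therefore
(since $\mathrm{cr}(d) \le 1$) $f$ must be a simple application starting with an axiom.
Now our only axioms involving $P$ are $\mathrm{Eq}_P$: $\forall x, y.\, x = y \to Px \to Py$ and
$\mathrm{Stab}_P$: $\forall x.\, \neg\neg Px \to Px$, and of these only $\mathrm{Stab}_P$ has the right form.
Hence $f = \mathrm{Stab}_P \ulcorner\gamma\urcorner$ and therefore $e\colon \neg\neg P\ulcorner\gamma\urcorner$. Now from $\mathrm{lev}(\neg\neg P\ulcorner\gamma\urcorner) = 2$,
the assumption $\mathrm{cr}(e) \le 1$ and again the form of our axioms involving $P$, it
follows that $e$ must end with $\to^+$, i.e., $e = \lambda u^{\neg P\ulcorner\gamma\urcorner}.\, e_0^{\bot}$. So we have


$$
\dfrac{
  \mathrm{Stab}_P \ulcorner\gamma\urcorner\colon\ \neg\neg P\ulcorner\gamma\urcorner \to P\ulcorner\gamma\urcorner
  \qquad
  \dfrac{\begin{matrix} [u\colon \neg P\ulcorner\gamma\urcorner] \\ \mid e_0 \\ \bot \end{matrix}}{\neg\neg P\ulcorner\gamma\urcorner}\ {\to^+}u
}{P\ulcorner\gamma\urcorner}
$$

The claim now follows from the IH for $e_0$.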

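Case $\forall^-$. By assumption we then are in the initial part of a track
starting with an axiom. Since $d$ is a $P\vec{\alpha}, \neg P\vec{\beta}$-refutation, that axiom must
contain $P$. It cannot be the equality axiom $\mathrm{Eq}_P$: $\forall x, y.\, x = y \to Px \to Py$,
since $\ulcorner\gamma\urcorner = \ulcorner\delta\urcorner \to P\ulcorner\gamma\urcorner \to P\ulcorner\delta\urcorner$ can never be (whether $\gamma = \delta$ or $\gamma \neq \delta$)
the end formula of a $P\vec{\alpha}, \neg P\vec{\beta}$-refutation. For the same reason it cannot
be the stability axiom $\mathrm{Stab}_P$: $\forall x.\, \neg\neg Px \to Px$. Hence the case $\forall^-$ cannot
occur.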


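Case Prog(P). Then $d = \langle d_\delta^{P\ulcorner\delta\urcorner} \rangle_{\delta < \gamma}^{P\ulcorner\gamma\urcorner}$. By assumption on $d$, $\gamma$ is in
$\vec{\beta}$. We may assume $\gamma = \beta_i := \min(\vec{\beta})$, for otherwise the premise deduction
$d_{\beta_i}\colon P\ulcorner\beta_i\urcorner$ would be a quasi-normal $P\vec{\alpha}, \neg P\vec{\beta}$-refutation, to which we could
apply the IH.

If there are no $\alpha_j < \gamma$, then the argument is simple: every $d_\delta$ is a
$P\vec{\alpha}, \neg P\vec{\beta}, \neg P\delta$-refutation, so by IH, since also no $\alpha_j < \delta$,


$\min(\vec{\beta}, \delta) = \delta \le \mathrm{dp}(d_\delta)$,

hence $\gamma = \min(\vec{\beta}) \le |d|$.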

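To deal with the situation that some $\alpha_j$ are less than $\gamma$, we observe that
there can be at most finitely many $\alpha_j$ immediately preceding $\gamma$; so let $\varepsilon$ be
the least ordinal such that


$\forall \delta.\ \varepsilon \le \delta < \gamma \to \delta \in \vec{\alpha}$.


Then $\varepsilon, \varepsilon+1, \dots, \varepsilon+k-1 \in \vec{\alpha}$, $\varepsilon + k = \gamma$. We may assume that $\varepsilon$ is either
a successor or a limit. If $\varepsilon = \varepsilon' + 1$, it follows by the IH that since $d_{\varepsilon-1}$ is a
$P\vec{\alpha}, \neg P\vec{\beta}, \neg P(\varepsilon - 1)$-refutation,


$\varepsilon - 1 \le \mathrm{dp}(d_{\varepsilon-1}) + \mathrm{lh}(\vec{\alpha}') - k$,

where $\vec{\alpha}'$ is the sequence of $\alpha_j < \gamma$. Hence $\varepsilon \le |d| + \mathrm{lh}(\vec{\alpha}') - k$, and so
$\gamma \le |d| + \mathrm{lh}(\vec{\alpha}')$.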


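If $\varepsilon$ is a limit, there is a sequence $\langle \delta_{f(n)} \rangle_n$ with limit $\varepsilon$, and with all $\alpha_j < \varepsilon$
below $\delta_{f(0)}$, and so by IH


$\delta_{f(n)} \le \mathrm{dp}(d_{\delta_{f(n)}}) + \mathrm{lh}(\vec{\alpha}') - k$,

and hence $\varepsilon \le |d| + \mathrm{lh}(\vec{\alpha}') - k$, so $\gamma \le |d| + \mathrm{lh}(\vec{\alpha}')$.

4.4. Underivability of Transfinite Induction.
THEOREM. Transfinite induction up to $\varepsilon_0$ is underivable in Z, i.e.


$Z \nvdash (\forall x.\, \forall y{\prec}x\, Py \to Px) \to \forall x\, Px$


with a relation symbol $P$, and for $k \ge 3$ transfinite induction up to $\omega_{k+1}$ is
underivable in $Z_k$, i.e.,


$Z_k \nvdash (\forall x.\, \forall y{\prec}x\, Py \to Px) \to \forall x{\prec}\omega_{k+1}\, Px$.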


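PROOF. We restrict ourselves to the second part. So assume that transfinite
induction up to $\omega_{k+1}$ is derivable in $Z_k$. Then by the embedding
of $Z_k$ into $Z^\infty$ and the normal derivability of the progressiveness of $P$ in
$Z^\infty + \mathrm{Prog}(P)$ with finite depth we can conclude that $\forall x{\prec}\omega_{k+1}\, Px$ is derivable
in $Z^\infty + \mathrm{Prog}(P)$ with depth $< \omega^{m+1}$ and cutrank $\le k$. (Note that here
we need $k \ge 3$, since the formula expressing transfinite induction up to $\omega_{k+1}$
has level 3.) Now $k - 1$ applications of Lemma 4.1 yield a derivation of the
same formula $\forall x{\prec}\omega_{k+1}\, Px$ in $Z^\infty + \mathrm{Prog}(P)$ with depth $\gamma < 2_{k-1}^{\omega^{m+1}} < \omega_{k+1}$
and cutrank $\le 1$.

Hence there is also a quasi-normal derivation of $P\ulcorner\gamma + 3\urcorner$ in $Z^\infty +
\mathrm{Prog}(P)$ with depth $\gamma + 2$ and cutrank $\le 1$, of the form


$$
\dfrac{
  \dfrac{
    \begin{matrix} d \\ \forall x{\prec}\omega_{k+1}\, Px \end{matrix}
    \quad \ulcorner\gamma + 3\urcorner
  }{\ulcorner\gamma + 3\urcorner \prec \omega_{k+1} \to P\ulcorner\gamma + 3\urcorner}
  \qquad
  \begin{matrix} d' \\ \ulcorner\gamma + 3\urcorner \prec \omega_{k+1} \end{matrix}
}{P\ulcorner\gamma + 3\urcorner}
$$


where $d'$ is a deduction of finite depth (it may even be an axiom, depending
on the precise choice of axioms for Z); this contradicts the lemma just
proved. □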


4.5. Normalization for Arithmetic is Impossible. The normaliza- 
tion theorem for first-order logic applied to one of our arithmetical systems 
Z is not particularly useful since we may have used in our derivation induc- 
tion axioms of arbitrary complexity. Hence it is tempting to first eliminate 
the induction schema in favour of an induction rule allowing us to conclude 
$\forall x A(x)$ from a derivation of $A(0)$ and a derivation of $A(Sx)$ with an additional
assumption $A(x)$ to be cancelled at this point (note that this rule
is equivalent to the induction schema), and then to try to normalize the 
resulting derivation in the new system Z with the induction rule. We will 
apply Gentzen’s Theorems on Underivability and Derivability of Transfinite 
Induction to show that even a very weak form of the normalization theorem 
cannot hold in Z with the induction rule. 


THEOREM. The following weak form of a normalization theorem for Z
with the induction rule is false: “For any derivation $d^B$ with free assumption
variables among $\vec{u}^{\vec{A}}$ for formulas $\vec{A}$, $B$ of level $\le l$ there is a derivation $(d^*)^B$,
with free assumption variables contained in those of $d$, which contains only
formulas of level $\le k$, where $k$ depends on $l$ only.”


PROOF. Assume that such a normalization theorem holds. Consider the
formula

$(\forall x.\, \forall y{\prec}x\, Py \to Px) \to \forall x{\prec}\omega_{n+1}\, Px$

expressing transfinite induction up to $\omega_{n+1}$, which is of level 3. By Gentzen’s
Theorems on Derivability of Transfinite Induction it is derivable in Z. Now
from our assumption it follows that there exists a derivation of this formula
containing only formulas of level $\le k$, for some $k$ independent of $n$. Hence
$Z_k$ derives transfinite induction up to $\omega_{n+1}$ for any $n$. But this clearly
contradicts the theorem above (Underivability of Transfinite Induction).